Non-binary low-density parity-check (NB-LDPC) codes offer advantages over their binary counterparts, but their high decoding complexity remains a significant challenge. Iterative hard-reliability-based majority-logic decoding (IHRB-MLGD) algorithms are therefore attractive for NB-LDPC codes because of their low complexity. In this paper, we propose a layered improved IHRB-MLGD algorithm and design a partly parallel architecture for it. The improved algorithm achieves better error performance and faster convergence than existing IHRB-MLGD algorithms while maintaining low complexity. The proposed partly parallel architecture achieves a throughput of 779 Mbps in SMIC 0.13 µm CMOS technology.