Low-Latency, Low-Area, and Scalable Systolic-Like Modular Multipliers for $GF(2^{m})$ Based on Irreducible All-One Polynomials

Pramod Kumar Meher; Xin Lou

doi:10.1109/TCSI.2016.2614309

Low-Latency, Low-Area, and Scalable Systolic-Like Modular Multipliers for $GF(2^{m})$ Based on Irreducible All-One Polynomials

Meher, P.K., Lou, X.

Źródło

IEEE Transactions on Circuits and Systems I: Regular Papers > 2017 > 64 > 2 > 399 - 408

Abstrakt

In this paper, an efficient recursive formulation is suggested for systolic implementation of canonical basis finite field multiplication over

$GF(2^{m})$

based on irreducible AOP. We have derived a recursive algorithm for the multiplication, and used that to design a regular and localized bit-level dependence graph (DG) for systolic computation. The bit-level regular DG is converted into a fine-grained DG by node-splitting, and mapped that into a parallel systolic architecture. Unlike most of the existing structures, it does not involve any global communications for modular reduction. The proposed bit-parallel systolic structure has the same cycle time as that of the best existing bit-parallel systolic structure <xref ref-type="bibr" rid="ref1">[1]</xref>, but involves significantly less number of registers. The proposed bit-parallel design has a scalable latency of

$l+\lceil \log _{2}s\rceil +1$

cycles which is considerably low compared with those of existing systolic designs. Moreover, the proposed time-multiplexed structure is designed specifically for scalability of throughput and hardware-complexity to meet the area-time trade-off in resource-constrained applications while maintaining or reducing the overall latency. The ASIC synthesis report shows that the proposed bit-parallel structures offers nearly 30% saving of area and nearly 38% saving of power consumption over the best of the existing AOP-based systolic finite field multiplier.