Modular multiplication in the Galois fields GF(p) and GF(2^m) is an unavoidable and time-consuming bottleneck in public-key cryptosystems. Montgomery modular multiplication has emerged as a VLSI-efficient way to implement this operation. In this paper, a new scalable and pipelined Montgomery multiplier architecture that unifies the two important finite fields, GF(p) and GF(2^m), is presented. The proposed architecture reduces the slack of Montgomery multiplication in GF(2^m) without jeopardizing the timing of its operation in GF(p). Multiplication is accelerated in GF(2^m) for all ranges of modulus and in GF(p) for higher-precision moduli through a new dual-field adder and processing unit that can be pipelined in a kernel. The proposed dual-field adder has been optimized to operate in an existing architecture, which has been retimed to resolve the conflicts that limit the speed of the pipeline. The latency is formulated analytically in terms of the input word length, modulus precision and number of pipeline stages in order to evaluate the total computation time. The processing unit has been implemented on an FPGA, and the experimental results show improved throughput and latency over an existing dual-field processing unit.
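For readers unfamiliar with the dual-field idea, the following is a minimal bit-serial software sketch of textbook radix-2 Montgomery multiplication, not the scalable, word-based pipelined architecture proposed in the paper. It assumes that the only difference between the two fields is whether the adder propagates carries (GF(p)) or not (GF(2^m), where addition reduces to XOR). The names dual_field_add, mont_mul, gf2_mulmod and the flag fsel are hypothetical and chosen for illustration only.

```python
def dual_field_add(x, y, fsel):
    """Model of a dual-field adder (hypothetical name).

    fsel=True  -> integer addition with carries, as in GF(p)
    fsel=False -> carry-free addition (XOR), as in GF(2^m)
    """
    return x + y if fsel else x ^ y


def mont_mul(a, b, modulus, n, fsel):
    """Bit-serial Montgomery product of a and b.

    GF(p):   returns a*b*2^(-n) mod p, for odd p < 2^n and a, b < p.
    GF(2^m): returns a*b*x^(-n) mod f, where n = m, f is the field
             polynomial of degree m encoded as an integer (bit i is the
             coefficient of x^i), and a, b have degree below m.
    """
    s = 0
    for i in range(n):
        # Accumulate the partial product for bit i of a.
        if (a >> i) & 1:
            s = dual_field_add(s, b, fsel)
        # Add the modulus when needed so the low bit becomes zero.
        if s & 1:
            s = dual_field_add(s, modulus, fsel)
        # Exact division by 2 (or by x in the binary field).
        s >>= 1
    # Only GF(p) can leave a result in [p, 2p); GF(2^m) needs no correction.
    if fsel and s >= modulus:
        s -= modulus
    return s


def gf2_mulmod(a, b, f, m):
    """Plain (non-Montgomery) product in GF(2^m), used only as a reference."""
    r = 0
    for i in reversed(range(m)):
        r <<= 1
        if (r >> m) & 1:
            r ^= f
        if (a >> i) & 1:
            r ^= b
    return r


if __name__ == "__main__":
    # GF(p) with p = 97 and R = 2^7: multiplying the result by R recovers a*b mod p.
    p, n = 97, 7
    z = mont_mul(55, 71, p, n, True)
    print((z * (1 << n)) % p == (55 * 71) % p)

    # GF(2^8) with f = x^8 + x^4 + x^3 + x + 1 (0x11B); x^8 mod f is 0x1B.
    f, m = 0x11B, 8
    z2 = mont_mul(0x57, 0x83, f, m, False)
    print(gf2_mulmod(z2, 0x1B, f, m) == gf2_mulmod(0x57, 0x83, f, m))
```

Both fields share the same loop and differ only in the adder mode and the final conditional subtraction, which is the property a unified datapath can exploit. The result stays in the Montgomery domain (scaled by 2^(-n) or x^(-n)), so a conversion step is needed before comparing against an ordinary modular product.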