This paper presents a new low-power multiplication algorithm and its VLSI architecture. The "one less than the previous" rule is the foundation of the proposed algorithm. The algorithm is a simple, straightforward way to perform N x N unsigned binary multiplication using the constant 2^n - 1, which is applied recursively to both the multiplicand and the multiplier. Reusing the hardware resources in this way results in lower power consumption and a better power delay product than a conventional multiplier. A data flow graph of the algorithm's mathematical expression is used to construct the set of hardware blocks. For 256-bit multiplication, the proposed design saves 46.34% in power and 47.5% in power delay product compared with the existing conventional multiplier. Further, an optimization technique, retiming, is adopted to design a fast multiplier. The results of the proposed algorithm with retiming are compared with a conventional multiplier as well as a constant multiplier. The proposed architecture is functionally verified and synthesized with Cadence EDA tools using 45 nm technology libraries. With retiming, the proposed algorithm reduces the power delay product by 86.9% compared with the conventional method and by 75.05% compared with the proposed design without retiming.
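As a rough illustration of why a constant of the form 2^n - 1 (n ones in binary) is cheap to multiply by, the sketch below assumes the "one less than the previous" rule takes its usual form: the upper half of the product is the operand minus one, and the lower half is the operand's complement with respect to 2^n. The function name `times_all_ones` and its parameters are hypothetical, and this shows only the underlying identity, not the paper's recursive algorithm or hardware architecture.

```python
def times_all_ones(a: int, n: int) -> int:
    """Return a * (2**n - 1) using only a decrement, a subtraction and a shift.

    Illustrative sketch (hypothetical helper, not the paper's design).
    Valid for 0 <= a <= 2**n.
    """
    if a == 0:
        return 0
    if not 1 <= a <= (1 << n):
        raise ValueError("a must satisfy 0 <= a <= 2**n")
    high = a - 1            # "one less than the previous" operand
    low = (1 << n) - a      # complement of a with respect to 2**n
    # low < 2**n, so concatenating high and low is a shift followed by an OR
    return (high << n) | low


# 5 * 7 with n = 3: high = 4, low = 3, so the product is (4 << 3) | 3 = 35
print(times_all_ones(5, 3))
```

The identity behind the sketch is (a - 1) * 2^n + (2^n - a) = a * (2^n - 1), so no general-purpose partial-product array is needed for this one constant.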