Computation time and hardware resource usage are key concerns in the hardware implementation of any signal-processing algorithm. This paper presents the design and implementation of a new polyphase-decomposition-based wavelet filter architecture for power system harmonics estimation using the discrete wavelet packet transform (DWPT). A DWPT usually provides only coefficients as its output; the proposed architecture additionally provides root mean square (RMS) values directly. The proposed method reduces computational requirements and saves memory resources. Xilinx System Generator, a high-abstraction-level design tool, has been used to simulate and implement the proposed scheme on the Xilinx Artix-7 field-programmable gate array (FPGA) AC701 board. The performance of the proposed architecture has been validated and compared through hardware cosimulation with a variety of synthetic and experimental signals.
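As an illustration of the two ideas named above, the following minimal software sketch shows one analysis stage of a wavelet filter bank computed by polyphase decomposition, and RMS values obtained directly from the resulting coefficients. It is not the proposed hardware architecture. The Haar filter pair, the test signal, and the function names `decimate_direct`, `decimate_polyphase`, and `band_rms` are assumptions chosen for the example.

```python
import numpy as np

def decimate_direct(x, h):
    """Reference: filter at the full input rate, then discard every second sample."""
    return np.convolve(x, h)[::2]

def decimate_polyphase(x, h):
    """Polyphase form: split filter and input into even/odd phases so that
    each half-length branch runs at the (halved) output rate."""
    h_even, h_odd = h[0::2], h[1::2]
    x_even = x[0::2]                                   # x[2n]
    x_odd_delayed = np.concatenate(([0.0], x[1::2]))   # x[2n-1], taking x[-1] = 0
    y_even = np.convolve(x_even, h_even)
    y_odd = np.convolve(x_odd_delayed, h_odd)
    y = np.zeros(max(len(y_even), len(y_odd)))
    y[:len(y_even)] += y_even
    y[:len(y_odd)] += y_odd
    return y

def band_rms(coeffs, n_samples):
    """RMS contribution of one band, referred to the original signal length.
    Valid for an orthonormal filter pair, which preserves signal energy."""
    return np.sqrt(np.sum(coeffs ** 2) / n_samples)

# Assumed orthonormal Haar analysis filters (illustrative choice only).
s = 1.0 / np.sqrt(2.0)
h_lo = np.array([s, s])
h_hi = np.array([s, -s])

# Hypothetical test signal: a fundamental plus a 5th harmonic at 20% amplitude.
n = 256
t = np.arange(n) / n
x = np.sin(2 * np.pi * 4 * t) + 0.2 * np.sin(2 * np.pi * 20 * t)

lo = decimate_polyphase(x, h_lo)   # approximation coefficients
hi = decimate_polyphase(x, h_hi)   # detail coefficients

rms_lo = band_rms(lo, n)
rms_hi = band_rms(hi, n)
rms_total = np.sqrt(rms_lo ** 2 + rms_hi ** 2)   # combines to the RMS of x
```

In this sketch the polyphase branches produce the same coefficients as filtering followed by downsampling, but they never compute the samples that would be discarded. Because the assumed filter pair is orthonormal, the band RMS values combine in quadrature to the RMS of the input signal.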