This paper presents the FPGA implementation of retimed high-speed adaptive filter structures for speech enhancement. Several high-speed adaptive filtering structures for noise cancellation are implemented on Xilinx Spartan-6 and Virtex-4 series FPGA platforms. The implementations are observed to differ considerably in clock speed, hardware requirements, latency, and cost. For instance, on the Spartan-6 platform, the retimed DF-RDLMS implementation achieves a clock speed of 98.309 MHz, whereas the conventional unretimed DF-LMS structure achieves 85.485 MHz, an improvement of 15%. Similarly, on the Virtex-4 platform, the retimed DF-RDLMS implementation achieves 88.176 MHz, whereas the conventional unretimed DF-LMS structure achieves 75.855 MHz, an improvement of about 16.2%. VLSI implementation and performance analysis provide crucial information about an algorithm structure, such as its hardware requirements, power consumption, and real-time performance. The performance of the implemented structures has been evaluated in terms of operating frequency, maximum combinational path delay, latency, and power consumption.