This paper describes a project undertaken to simplify the implementation of high-throughput, low-power, numerically intensive applications on Xilinx Virtex FPGA platforms. The system is a pipeline of block floating-point processing elements. Block floating point combines the advantages of fixed-point and floating-point implementations: it improves data accuracy relative to fixed-point while keeping hardware resource usage to a minimum. The design is based on the DSP48E, a digital signal processing slice provided on certain Xilinx Virtex FPGAs. This approach yields high throughput (operation at 550 MHz) and low power consumption. The key to achieving this is the implementation of high-performance shifting and rounding modules on Virtex-5 platforms. The implementation can be easily ported to the Virtex-6 and 7 Series FPGA families.
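To make the block floating-point idea concrete, the following is a minimal software sketch of the general technique, not the hardware design described in this paper. A block of samples shares a single exponent, and each sample is stored as a fixed-point mantissa obtained by shifting and rounding. The function names `bfp_encode` and `bfp_decode`, the 16-bit mantissa width, and the round-half-up rule are all assumptions chosen for illustration.

```python
import math


def bfp_encode(block, mantissa_bits=16):
    """Quantize floats to signed integer mantissas that share one exponent.

    Hypothetical helper: the mantissa width and rounding mode are assumptions.
    Each value is approximated by mantissa * 2**exponent.
    """
    max_mag = max(abs(x) for x in block)
    if max_mag == 0.0:
        return [0] * len(block), 0

    # frexp gives max_mag = m * 2**e with 0.5 <= m < 1, so e is the
    # block exponent that normalizes the largest sample.
    _, e = math.frexp(max_mag)

    # Shift so the largest sample uses the available magnitude bits
    # (one bit is reserved for the sign).
    shift = (mantissa_bits - 1) - e
    hi = (1 << (mantissa_bits - 1)) - 1
    lo = -hi - 1

    mantissas = []
    for x in block:
        # Shift, then round half up, then saturate to the mantissa range.
        q = math.floor(math.ldexp(x, shift) + 0.5)
        mantissas.append(max(lo, min(hi, q)))
    return mantissas, -shift


def bfp_decode(mantissas, exponent):
    """Reconstruct approximate floats from mantissas and the shared exponent."""
    return [math.ldexp(float(m), exponent) for m in mantissas]


mantissas, exponent = bfp_encode([0.75, -0.5, 0.001, 0.25])
print(mantissas, exponent)  # [24576, -16384, 33, 8192] -15
print(bfp_decode(mantissas, exponent))
```

Because the exponent is stored once per block rather than once per sample, the per-sample arithmetic stays fixed-point, which is what allows it to map onto integer multiply-accumulate hardware. The cost is that small samples in a block dominated by a large one lose relative precision, as the value 0.001 illustrates above.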