This paper proposes Hybrid Floating-Point Modules (HFPMs) as a method to improve software floating-point (FP) throughput without incurring the area overhead of hardware floating-point units (FPUs). The proposed HFPMs were synthesized in 65 nm CMOS. Compared with a fixed-point software FP implementation, they increase throughput by 3.6× for addition/subtraction and 2.3× for multiplication, while requiring less area than hardware modules. Nine functionally equivalent FPU implementations, built from combinations of software, hardware, and hybrid modules, were also synthesized; they provide 1.07–3.34× higher throughput than a software FPU implementation while requiring 1.08–12.5× less area than a hardware FPU for multiply-add operations.