This paper describes key circuit innovations in a new x86-64 micro-architecture <citerefgrp><citeref refid="ref1"/></citerefgrp>, code-named “Bulldozer” by AMD <citerefgrp><citeref refid="ref2"/></citerefgrp>, <citerefgrp><citeref refid="ref3"/></citerefgrp>. The Bulldozer module is implemented in 32 nm high-k metal-gate SOI CMOS, occupies 30.9 mm², and contains 213 million transistors. It reduces the number of FO4 (fanout-of-4) gate delays per cycle by more than 20% compared to a previous processor in the same technology <citerefgrp><citeref refid="ref4"/></citerefgrp> and demonstrates superior frequency scaling across voltage. The module includes two independent integer cores but shares the fetch, decode, floating-point, and L2 cache units between them, to maximize single-threaded performance and multi-threaded throughput while significantly improving power and area efficiency compared to fully replicated CPU cores. The design introduces a new soft-edged flop (SEF) family to enable high frequency at low power. Achieving power efficiency in a high-frequency design is a particular challenge, and this paper describes several of the power-optimization approaches employed in the design. Together, the gate-count reduction and power optimization enable higher frequencies in the same power envelope than previous designs.