In this paper, we propose a new spin-transfer torque magnetic random access memory (STT-MRAM) bit-cell structure (with complementary polarizers) that is suitable for on-chip caches. Our proposed structure requires a lower average critical write current than standard STT-MRAM, with improved write-ability, readability, and reliability. A cache array based on our proposed structure is studied using a device/circuit simulation framework, which we developed for this paper. Simulation results show that at the bit-cell level, our proposed structure can achieve subnanosecond sensing delay and lower read disturb torque using a self-referenced differential READ operation. Sensing and disturb margins of our proposed cell are \(1.8\times \) and \(2.4\times \) better than standard STT-MRAM, respectively. Furthermore, near disturb-free READ operation at ≥1.5 GHz is achieved using a latch-based sense amplifier and verified in circuit simulations. In addition, content addressable memory may also be efficiently implemented using complementary polarizer spin-transfer torque (CPSTT). Transient SPICE simulations show that CPSTT may be suitable for L1 cache, with a read energy of 14 fJ/bit. System level simulation shows that a CPSTT-based L2 cache can achieve \(\sim 9\) % lower energy consumption and >9% improvement in instructions per cycle over a standard STT-MRAM-based cache.