Thousands of deep and wide pipelines working concurrently make GPGPUs power-hungry components. Energy-efficiency techniques that employ voltage overscaling increase timing sensitivity to variations and hence aggravate the energy-use problem. This paper proposes a method to increase spatiotemporal reuse of computational effort through a combination of compilation and micro-architectural design. An associative memristive memory (AMM) module is integrated with each of the floating-point units (FPUs). Together, these enable fine-grained partitioning of values; using application-specific profile feedback, we search the space of possible inputs to find high-frequency sets of values for the FPUs. For every kernel execution, the compiler pre-stores these high-frequency sets of values in the AMM modules (each representing partial functionality of its associated FPU), which are evaluated concurrently over two clock cycles. Our simulation results show high hit rates with 32-entry AMM modules, which enable a 36% reduction in the average energy use of the kernel codes. Compared to voltage overscaling, this technique enhances robustness against timing errors while achieving a 39% average energy saving.
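
The profile-then-pre-store flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design: the names `build_amm` and `run_kernel` are hypothetical, matching is assumed to be exact on the full operand tuple, and the lookup is modelled sequentially in place of the concurrent two-cycle evaluation. It does not model memristive devices, timing, or energy.

```python
from collections import Counter

AMM_ENTRIES = 32  # module size quoted in the text


def build_amm(profile_trace, op, entries=AMM_ENTRIES):
    """Model of the compiler step (hypothetical helper).

    Counts operand tuples seen during profiling, keeps the `entries`
    most frequent ones, and pre-computes their results.
    """
    counts = Counter(profile_trace)
    return {operands: op(*operands)
            for operands, _ in counts.most_common(entries)}


def run_kernel(trace, op, amm):
    """Model of execution (hypothetical helper).

    On an AMM hit the stored result is reused; on a miss the operation
    is computed, standing in for the FPU. Returns (results, hit_rate).
    """
    hits = 0
    results = []
    for operands in trace:
        if operands in amm:          # assumption: exact-match lookup
            hits += 1
            results.append(amm[operands])
        else:
            results.append(op(*operands))
    hit_rate = hits / len(trace) if trace else 0.0
    return results, hit_rate


if __name__ == "__main__":
    multiply = lambda a, b: a * b
    # Toy trace: a few operand pairs dominate, as profiling is assumed to show.
    trace = [(1.0, 2.0)] * 3 + [(0.5, 0.5)] * 2 + [(3.0, 4.0)]
    amm = build_amm(trace, multiply, entries=2)
    results, hit_rate = run_kernel(trace, multiply, amm)
    print(f"hit rate: {hit_rate:.2f}")
```

In this model the hit rate depends entirely on how skewed the operand distribution is; the toy trace is made up for illustration and says nothing about the hit rates reported in the simulation results.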