In this paper, we present a CMOS image sensor architecture that couples a spatial light modulator to a photodiode, intended for medical imaging based on acousto-optical coherence tomography with a digital holographic detection scheme. The architecture measures the interference pattern between a reference beam and a beam transmitted through a scattering medium, on an array with a 16 μm pixel pitch, at 4000 Hz, a rate compatible with the correlation time of breast tissue. In-pixel processing derives from the incident light a signal that biases an embedded light modulator, which in turn controls the phase of the reflected beam. This reflected beam can then be focused on a region of interest within the scattering medium, for therapy. Stacking a photosensitive element and a spatial light modulator on the same device makes the system significantly more robust than state-of-the-art techniques, because it provides perfect optical matching and reduces the time delay in controlling the light.
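As an illustration of the detect-then-modulate principle summarized above, the following minimal sketch simulates a pixel array that derives a phase command from the measured interference pattern. It is not the circuit described in this paper: the two-frame phase stepping, the binary (0 or π) modulator, the unit-amplitude speckle model, the reciprocal-medium assumption, and all names (`binary_phase_mask`, `focus_intensity`, `n_pixels`) are assumptions made only for the example. The only figure taken from the text is the 4000 Hz frame rate.

```python
import cmath
import math
import random


def binary_phase_mask(scattered, ref_amplitude=1.0):
    """Return +1 or -1 per pixel from two phase-stepped interference frames.

    Assumption: each pixel sees the scattered field plus a reference beam
    whose phase is stepped between 0 and pi (hypothetical scheme).
    """
    mask = []
    for e_s in scattered:
        i_0 = abs(e_s + ref_amplitude) ** 2   # frame with reference phase 0
        i_pi = abs(e_s - ref_amplitude) ** 2  # frame with reference phase pi
        # i_0 - i_pi = 4 * ref_amplitude * Re(e_s), so its sign tells whether
        # the pixel should reflect with phase 0 (+1) or phase pi (-1).
        mask.append(1 if i_0 >= i_pi else -1)
    return mask


def focus_intensity(scattered, mask):
    """Intensity returned to the source point, assuming a reciprocal medium."""
    return abs(sum(m * e_s for m, e_s in zip(mask, scattered))) ** 2


rng = random.Random(0)
n_pixels = 1024  # hypothetical array size, not taken from the paper

# Fully developed speckle: unit amplitude, uniformly random phase per pixel.
scattered = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_pixels)]

mask = binary_phase_mask(scattered)
flat = focus_intensity(scattered, [1] * n_pixels)  # modulator left idle
shaped = focus_intensity(scattered, mask)          # modulator driven per pixel

# Time budget per measurement and update at the stated 4000 Hz rate.
frame_period_us = 1e6 / 4000

print(f"frame period: {frame_period_us} us")
print(f"intensity at target, shaped / flat: {shaped / flat:.1f}")
```

Under these assumptions, the phase-shaped reflection returns far more light to the target than an unmodulated one, and the whole measure-and-update cycle has to fit within one 250 μs frame period for the correction to remain valid while the tissue decorrelates.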