Recent research in brain-machine interfaces (BMIs) has shown that cortical implants can record neural activity and wirelessly transmit it to external workstations for further processing, spike sorting, and decoding. To reduce the complexity, bandwidth, and power consumption of such systems, we introduce a miniaturized real-time spike sorting VLSI architecture that is robust to very low signal-to-noise ratios (SNRs). This eliminates all external spike sorting dependencies, bringing the entire system one step closer to being fully integrated and fully implanted. The algorithm used in this architecture exploits three features to achieve better classification and real-time sorting: the spatial distribution of neurons across electrodes, the temporal and spectral information in the spike waveforms of individual neurons, and the hardware limitations imposed by the size of the implant.