Blind source separation can be implemented in the frequency domain using a one-tap multiplication in each frequency bin, but only when the frame length is long enough for temporal aliasing effects to be negligible. If the short-time frequency transform uses a window shorter than the room reverberation time, this justification no longer holds. In this paper, we present an appropriate representation in the short-time frequency domain for this case. Its suitability is justified by showing its equivalence to the original time-domain approach in the overlap-add context. Experimental validation is also provided, using a corpus synthesized by convolution with measured sets of room impulse responses.
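
The temporal aliasing issue mentioned above can be illustrated with a minimal NumPy sketch. It is not the representation proposed in the paper. It only shows, under toy assumptions, why per-bin (one-tap) multiplication matches time-domain filtering when the transform is long enough and fails when it is shorter than the filter's reach. The random signal, the random 32-tap filter standing in for a room impulse response, the frame sizes, and the helper name `one_tap_filter` are all hypothetical choices made for illustration.

```python
import numpy as np

def one_tap_filter(x, h, n_fft):
    """Filter x by h using one multiplication per DFT bin.

    This is equivalent to circular convolution of length n_fft.
    """
    X = np.fft.fft(x, n_fft)  # zero-pads x up to n_fft if needed
    H = np.fft.fft(h, n_fft)  # zero-pads h up to n_fft
    return np.real(np.fft.ifft(X * H))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)  # one signal frame (toy data)
h = rng.standard_normal(32)  # toy stand-in for a room impulse response

# Reference: linear (time-domain) convolution, length 64 + 32 - 1 = 95.
y_lin = np.convolve(x, h)

# Long transform: n_fft >= len(x) + len(h) - 1, so nothing wraps around.
y_long = one_tap_filter(x, h, 128)[:len(y_lin)]
err_long = np.max(np.abs(y_long - y_lin))

# Short transform: n_fft = 64 < 95, so the tail of the response
# wraps around and adds onto the start of the frame (temporal aliasing).
y_short = one_tap_filter(x, h, 64)
err_short = np.max(np.abs(y_short - y_lin[:64]))

# The short-frame output equals the linear result with its tail folded back.
aliased = y_lin[:64].copy()
aliased[:len(y_lin) - 64] += y_lin[64:]
matches_folded = np.allclose(y_short, aliased)

print("long-frame error: ", err_long)
print("short-frame error:", err_short)
print("short frame equals folded linear convolution:", matches_folded)
```

In this sketch the long-frame error sits at floating-point rounding level, while the short-frame error is of the same order as the signal itself, which is the situation a window shorter than the reverberation time runs into.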