This paper presents a stereo framework for robust real-time localization of objects using camera pairs from a network. The stereo system combines static and pan-tilt-zoom (PTZ) cameras instead of the traditional dual cameras mounted on a common head. The novelty lies in applying stereo vision to heterogeneous cameras belonging to a video-surveillance network. First, a look-up table (LUT) is built from the rectification transformations computed for a set of predefined pan and tilt values. Neural networks then use the LUT to compute the rectification transformations for arbitrary pan and tilt settings. Differences in zoom level are compensated by resizing the images according to their focal ratio and applying zero padding. Objects are localized from their 3D positions, which are obtained with a modified stereo concept. Experimental results are presented for the localization of moving objects in a parking lot scenario.
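The zoom compensation step can be illustrated with a minimal sketch. It assumes that the PTZ image is the more zoomed one, that it is shrunk by the ratio of the reference focal length to its own focal length, and that the result is centered in a zero-filled frame of the original size. The function name `compensate_zoom`, the parameters `f_ref` and `f_cam`, the nearest-neighbor resampling, and the centered placement are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def compensate_zoom(img, f_ref, f_cam):
    """Shrink `img` by the focal ratio f_ref / f_cam and zero-pad to the input size.

    Assumption: f_cam >= f_ref, i.e. the image to compensate is the zoomed one.
    """
    ratio = f_ref / f_cam
    if ratio <= 0 or ratio > 1:
        raise ValueError("this sketch only handles 0 < f_ref / f_cam <= 1")

    h, w = img.shape[:2]
    new_h = max(1, int(round(h * ratio)))
    new_w = max(1, int(round(w * ratio)))

    # Nearest-neighbor resampling: map each output pixel back to a source pixel.
    rows = np.clip((np.arange(new_h) / ratio).astype(int), 0, h - 1)
    cols = np.clip((np.arange(new_w) / ratio).astype(int), 0, w - 1)
    small = img[rows[:, None], cols]

    # Zero padding: place the resized image at the center of a black frame.
    out = np.zeros_like(img)
    top = (h - new_h) // 2
    left = (w - new_w) // 2
    out[top:top + new_h, left:left + new_w] = small
    return out


if __name__ == "__main__":
    ptz_image = np.ones((8, 8), dtype=np.uint8)
    matched = compensate_zoom(ptz_image, f_ref=1.0, f_cam=2.0)
    print(matched)
```

With a focal ratio of 0.5, the 8 x 8 test image is reduced to 4 x 4 and surrounded by a border of zeros, so both images of the pair keep the same pixel dimensions. A real system would likely use a higher-quality interpolation and would account for the principal point when placing the resized image.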