We propose a technique for detecting pedestrians by employing stereo camera images and based on probabilistic voting. From a disparity map, each pixel on the image is voted on a depth map employing a 2-D Gaussian distribution. The region having the peak value in the vote is chosen as the foot of an object. The object is specified by a rectangle on the right image, which is referred to as the region of interest (ROI). This ROI is described by HOG features, and is judged by SVM if it contains a person. With an ROI containing a person, a Kalman filter is applied to track the person through successive image frames. The performance of the detection of people was evaluated by employing ground truth data. The ratio of people detected to the ground truth data, called the recall rate, was 80%. This is a satisfactory result.