We present a strategy based on human gait to achieve efficient tracking, recovery of ego-motion and 3-D reconstruction from an image sequence acquired by a single camera attached to a pedestrian. In the first phase, the parameters of the human gait are established by a classical frame-by-frame analysis, using a generalised least squares (GLS) technique. The gait model is non-linear and is represented by a truncated Fourier series. In the second phase, this gait model is employed within a “predict-correct” framework, using a maximum a posteriori expectation-maximisation (MAP-EM) strategy to obtain robust estimates of the ego-motion and scene structure while continuously refining the gait model. Experiments on synthetic and real image sequences show that using the gait model results in more efficient tracking, as demonstrated by improved matching and retention of features and by a reduction in execution time when processing video sequences.
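To make the idea of a truncated Fourier series gait model concrete, the following is a minimal sketch, not the method described above. It assumes that a single motion parameter (for example, the vertical camera displacement) is periodic with a known fundamental frequency `f0`, which makes the fit linear in the coefficients, and it uses ordinary least squares in place of GLS. The function names `fourier_design`, `fit_gait` and `predict_gait` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fourier_design(t, f0, n_harmonics):
    """Build the design matrix [1, cos(2*pi*k*f0*t), sin(2*pi*k*f0*t)] for k = 1..n_harmonics."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * f0
        cols.append(np.cos(w * t))
        cols.append(np.sin(w * t))
    return np.column_stack(cols)


def fit_gait(t, y, f0, n_harmonics):
    """Fit the series coefficients by ordinary least squares.

    Assumption: f0 is known, so the problem is linear. Estimating f0 as well
    would make the model non-linear in its parameters.
    """
    A = fourier_design(t, f0, n_harmonics)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coeffs


def predict_gait(t, coeffs, f0):
    """Evaluate the fitted series at new time stamps (the 'predict' step)."""
    n_harmonics = (len(coeffs) - 1) // 2
    return fourier_design(t, f0, n_harmonics) @ coeffs
```

In a predict-correct loop, `predict_gait` would supply the expected motion for the next frame, which can narrow the search region for feature matching; the coefficients can then be re-fitted as new per-frame estimates arrive.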