Multi-people tracking has been studied for decades and is already applied in many computer vision tasks. This paper proposes a framework that performs frame-to-frame tracking and follows the tracking-by-detection approach. Saliency detection is introduced to enhance multi-people tracking and is performed on two layers: salient parts inside the human patch denote representative regions of the target, while salient parts around the target capture context information. Short-term tracking of the salient parts is applied alongside data association. When data association with detections fails, supporting models learned on-line indicate the locations of the targets based on the tracking results of the salient parts. A Bayesian method is used to reason about mutual occlusion. Experiments on several public datasets show that the proposed method achieves promising performance compared with state-of-the-art methods.
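
As a rough illustration of the association-with-fallback idea summarized above, the following minimal sketch matches tracks to detections and, when a track has no matching detection, moves its box using the motion of its tracked salient parts. Everything here is an assumption made for illustration and not the method of the paper: the greedy IoU matching, the `min_iou` threshold, and the helper names `associate`, `fallback_box`, and `update` are hypothetical, and the mean part displacement merely stands in for the on-line learned supporting models. Saliency detection and occlusion reasoning are omitted.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Shift = Tuple[float, float]              # (dx, dy) of one salient part


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: Dict[int, Box], detections: List[Box],
              min_iou: float = 0.3) -> Dict[int, Optional[int]]:
    """Greedy matching (an assumption): best IoU pairs first.

    Tracks left without a detection map to None.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, Optional[int]] = {tid: None for tid in tracks}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be matched
        if matches[tid] is None and j not in used:
            matches[tid] = j
            used.add(j)
    return matches


def fallback_box(prev_box: Box, part_shifts: List[Shift]) -> Box:
    """Stand-in for the supporting models (an assumption): shift the
    previous box by the mean displacement of its tracked salient parts."""
    if not part_shifts:
        return prev_box
    dx = sum(s[0] for s in part_shifts) / len(part_shifts)
    dy = sum(s[1] for s in part_shifts) / len(part_shifts)
    return (prev_box[0] + dx, prev_box[1] + dy,
            prev_box[2] + dx, prev_box[3] + dy)


def update(tracks: Dict[int, Box], detections: List[Box],
           part_shifts: Dict[int, List[Shift]]) -> Dict[int, Box]:
    """One frame-to-frame step: use the matched detection if there is one,
    otherwise fall back to the salient-part motion."""
    matches = associate(tracks, detections)
    updated: Dict[int, Box] = {}
    for tid, box in tracks.items():
        j = matches[tid]
        if j is not None:
            updated[tid] = detections[j]
        else:
            updated[tid] = fallback_box(box, part_shifts.get(tid, []))
    return updated
```

In this sketch a track whose detection is missing in the current frame is not dropped; it is carried forward by whatever motion its salient parts exhibit, which mirrors the role the abstract assigns to short-term part tracking.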