Visual tracking is a challenging task in many computer vision applications because of factors such as occlusion, scale variation, and background clutter. In this paper, we present a robust tracking algorithm that represents the target at two levels, global and local, and is accordingly composed of a global part and a local part. The global part is a discriminative model that separates the foreground object from the background using holistic features. The local part captures the target's local representation with a set of filters convolved with the target region at each position. The two parts are then integrated into a collaborative model that forms the final tracker. Experiments on the tracking benchmark dataset with 50 challenging videos demonstrate the robustness and effectiveness of the proposed algorithm, which outperforms several state-of-the-art models.
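
To make the two-level idea concrete, the following is a minimal sketch of how a global score and a local score might be combined to rank candidate patches. It is an illustration under stated assumptions, not the model described in this paper: the linear classifier with a sigmoid output, the cosine similarity between filter-response maps, the product used as the collaborative score, and all function and parameter names (`global_score`, `local_score`, `collaborative_score`, `track`, `w`, `filters`) are hypothetical choices made for the example.

```python
import numpy as np

def global_score(patch, w, b=0.0):
    """Assumed global part: a linear classifier on holistic (flattened) features,
    squashed to (0, 1) with a sigmoid. w and b are hypothetical learned parameters."""
    z = float(patch.ravel() @ w + b)
    return 1.0 / (1.0 + np.exp(-z))

def filter_responses(patch, filters):
    """Assumed local part: slide each k x k filter over every valid position.
    This is cross-correlation; flip the filters for convolution in the strict sense.
    patch: (H, W), filters: (F, k, k) -> responses: (F, H-k+1, W-k+1)."""
    k = filters.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(patch, (k, k))
    return np.einsum("ijkl,fkl->fij", windows, filters)

def local_score(patch, template_resp, filters):
    """Cosine similarity between the candidate's and the template's response maps
    (an assumed similarity measure)."""
    a = filter_responses(patch, filters).ravel()
    b = template_resp.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def collaborative_score(patch, w, template_resp, filters):
    """Assumed integration rule: product of the global score and the
    non-negative part of the local score."""
    g = global_score(patch, w)
    l = max(local_score(patch, template_resp, filters), 0.0)
    return g * l

def track(candidates, w, template_resp, filters):
    """Return the index of the best-scoring candidate and all scores."""
    scores = [collaborative_score(c, w, template_resp, filters) for c in candidates]
    return int(np.argmax(scores)), scores
```

In this sketch a candidate must score well at both levels to be selected: a patch that the global classifier accepts but whose local filter responses differ from the template's is penalized, and vice versa. Other integration rules, such as a weighted sum, would fit the same structure.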