Foreground object detection is a crucial component of intelligent surveillance systems, and it remains challenging in complex scenes with illumination variations and dynamic backgrounds. Intuitively, foreground object pixels are not scattered sparsely across the image but tend to be clustered. Motivated by this hypothesis, we present a new structured sparse model for extracting foreground objects, which incorporates spatial neighborhood information into a unified optimization framework through an l1,2 mixed norm. We also describe in detail how the proposed model is solved. We then apply the model to sparse signal recovery and to background subtraction in videos. In our experiments, the proposed method outperforms previous methods, and the results support both the clustering hypothesis and the effectiveness of the method.
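To make the role of the l1,2 mixed norm concrete, the following is a minimal sketch, not the model proposed here. It assumes that each group is the square neighborhood around a pixel (with zero padding at the image border), takes the l2 norm within each group, and sums those norms across groups. The function name `l12_neighborhood_norm` and the `radius` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def l12_neighborhood_norm(x, radius=1):
    """Sum, over all pixels, of the l2 norm of the pixel's square neighborhood.

    Assumption: groups are overlapping (2*radius+1) x (2*radius+1) windows
    centered on each pixel, with zeros outside the image.
    """
    x = np.asarray(x, dtype=float)
    h, w = x.shape
    k = 2 * radius + 1
    # Zero-pad the squared entries so border windows are well defined.
    padded = np.pad(x ** 2, radius, mode="constant")
    # Accumulate the sum of squares inside each window by shifting the image.
    window_sums = np.zeros_like(x)
    for dy in range(k):
        for dx in range(k):
            window_sums += padded[dy:dy + h, dx:dx + w]
    # l2 norm inside each group, l1 norm (plain sum) across groups.
    return float(np.sqrt(window_sums).sum())


# Two 8x8 masks with the same number of nonzero pixels (same l1 norm).
clustered = np.zeros((8, 8))
clustered[3:5, 3:5] = 1.0  # one 2x2 blob

scattered = np.zeros((8, 8))
for r, c in [(1, 1), (1, 5), (5, 1), (5, 5)]:
    scattered[r, c] = 1.0  # four isolated pixels

norm_clustered = l12_neighborhood_norm(clustered)
norm_scattered = l12_neighborhood_norm(scattered)
```

Under these assumptions the clustered mask receives a smaller penalty than the scattered one even though both have four nonzero pixels, which is the behavior a structured sparse prior is meant to encourage.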