The bag of visual words (BoW) representation has recently demonstrated strong performance in image classification tasks and attracted considerable attention in the computer vision community. The original BoW model represents an image as an orderless collection of local features and disregards all information about their spatial layout, which limits its descriptive ability. Spatial pyramid matching (SPM) approximates geometric layout by partitioning the image into increasingly fine sub-regions and has become a standard procedure for image classification. In this paper, we use a co-occurrence matrix to capture the spatial layout of visual words and represent an image with a visual words co-occurrence matrix (VWCM). We evaluate the proposed method, BoW, and SPM on two standard datasets, 15 Scenes and Caltech-256, under the same experimental protocol. The results validate the effectiveness of VWCM for image classification.
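
To make the idea of a visual word co-occurrence matrix concrete, the following is a minimal sketch and not necessarily the construction used in this paper. It assumes that local features have been extracted on a dense regular grid and already quantized, so that the image is given as a 2-D array of visual word indices. The function name `visual_word_cooccurrence`, the `offset` parameter, and the toy input are hypothetical and chosen only for illustration.

```python
import numpy as np

def visual_word_cooccurrence(word_map, vocab_size, offset=(0, 1)):
    """Count how often word i has word j at a fixed grid displacement.

    word_map   : 2-D integer array of visual word indices (assumed dense grid).
    vocab_size : number of visual words in the codebook.
    offset     : (row, col) displacement between the two grid cells (assumption).
    """
    word_map = np.asarray(word_map)
    dr, dc = offset
    rows, cols = word_map.shape

    # Restrict to cells whose displaced neighbour is still inside the grid.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    src = word_map[r0:r1, c0:c1]
    dst = word_map[r0 + dr:r1 + dr, c0 + dc:c1 + dc]

    # Accumulate one count per (source word, neighbour word) pair.
    counts = np.zeros((vocab_size, vocab_size), dtype=np.int64)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1)
    return counts

# Toy 2x3 grid of word indices with a 3-word vocabulary (illustrative only).
toy_map = [[0, 1, 1],
           [2, 0, 1]]
counts = visual_word_cooccurrence(toy_map, vocab_size=3, offset=(0, 1))

# Flatten and L1-normalize to obtain a fixed-length image descriptor.
descriptor = counts.ravel() / counts.sum()
```

Unlike a plain BoW histogram, which has one bin per visual word, this descriptor has one bin per ordered pair of words, so it records which words tend to appear next to each other at the chosen displacement. Several offsets could be combined by concatenating or summing the resulting matrices.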