This paper presents a novel method for online, real-time, content-based image training and retrieval. The method relies on a bag-of-words model with SIFT features, in which a large-scale vocabulary tree is generated from extracted feature data and can describe all kinds of images. The vocabulary tree serves as a codebook with which new images are described: we use it to generate a vector for each new image, and retrieval is achieved by comparing the similarity between the query vector and the vectors stored in the database. Experimental results show that the proposed method achieves good performance.
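As a rough illustration of the pipeline summarized above, the following Python sketch builds a small vocabulary tree by hierarchical k-means, quantizes each image's descriptors to leaf "visual words", and ranks database images by similarity to a query vector. It is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the branching factor, tree depth, plain term-frequency weighting, cosine similarity, and all function names are hypothetical choices, and random 128-dimensional vectors stand in for real SIFT descriptors.

```python
import numpy as np

def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

class Node:
    def __init__(self):
        self.centers = None   # (k, d) child centers; None at a leaf
        self.children = []
        self.leaf_id = None   # index of the visual word; None at inner nodes

def build_tree(points, k, depth, rng, leaves):
    """Recursively split descriptors into k clusters per level."""
    node = Node()
    if depth == 0 or len(points) < k:
        node.leaf_id = len(leaves)
        leaves.append(node)
        return node
    node.centers, labels = kmeans(points, k, rng)
    node.children = [build_tree(points[labels == j], k, depth - 1, rng, leaves)
                     for j in range(k)]
    return node

def quantize(node, desc):
    """Descend the tree, picking the nearest child center at each level."""
    while node.leaf_id is None:
        j = ((node.centers - desc) ** 2).sum(axis=1).argmin()
        node = node.children[j]
    return node.leaf_id

def bow_vector(root, n_words, descriptors):
    """L2-normalized histogram of visual words for one image."""
    hist = np.zeros(n_words)
    for d in descriptors:
        hist[quantize(root, d)] += 1
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def rank(query_vec, db_vecs):
    """Cosine similarity (vectors are unit length), best match first."""
    scores = db_vecs @ query_vec
    return np.argsort(-scores), scores

# Assumption: random vectors stand in for SIFT descriptors of 5 images.
rng = np.random.default_rng(0)
images = [rng.random((200, 128)) for _ in range(5)]

leaves = []
root = build_tree(np.vstack(images), k=3, depth=3, rng=rng, leaves=leaves)
database = np.stack([bow_vector(root, len(leaves), d) for d in images])

# Query with the descriptors of image 2 and list the database ranking.
ranking, scores = rank(bow_vector(root, len(leaves), images[2]), database)
print(ranking, scores)
```

In a real setting the stand-in descriptors would be replaced by SIFT descriptors extracted from images, and the tree would be trained once offline on a large descriptor set so that new images can be added to the database online by computing only their vectors.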