Structure from Motion (SfM), which reconstructs a 3D model from a sequence of 2D images, is one of the most important reconstruction techniques in computer vision. In recent years, large unordered photo collections have become available on the Web, and many SfM methods that use Web images have been proposed. However, the computational cost of image matching in these methods is very high. To make matching feasible, it is important to determine image connectivity for SfM in advance. Bag-of-visual-words (BoVW) with standard term frequency-inverse document frequency (tf-idf) weighting is one of the most widely used methods for computing the similarity of an image pair as a vector similarity. Alternatively, a method based on Jaccard similarity, which is a set similarity, has been proposed for this purpose and improved precision. In this paper, we present a novel set similarity called modified Simpson similarity, and a method for finding appropriate image pairs to match in SfM by combining it with tf-idf weighting. In our method, similarities between images are calculated in two ways: with tf-idf as the vector similarity and with modified Simpson similarity as the set similarity. After pairwise similarities are computed by each method, each image is used as a query and its top-k most similar images are selected; we then take the intersection of the image pairs predicted by the two methods. We evaluate our method on a large dataset that serves as a benchmark for this task. The combination of tf-idf and modified Simpson similarity achieved higher accuracy than tf-idf alone, other set similarities alone, and combinations of tf-idf with other set similarities.
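The pair-selection step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this paper: each image is given as a list of visual-word ids, tf-idf uses a plain `tf * log(N / df)` weighting with cosine similarity, predicted pairs are treated as unordered, and the set similarity is the standard Simpson (overlap) coefficient, used only as a stand-in because the modified Simpson similarity is not defined in this abstract. All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (dict: word id -> weight) for each image.

    Assumption: tf is the relative word frequency and idf is log(N / df).
    """
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({w: (c / len(d)) * math.log(n / df[w]) for w, c in tf.items()})
    return vecs


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def simpson(a, b):
    """Standard Simpson (overlap) coefficient: |A & B| / min(|A|, |B|).

    Stand-in only; this is NOT the modified Simpson similarity of the paper.
    """
    a, b = set(a), set(b)
    m = min(len(a), len(b))
    return len(a & b) / m if m else 0.0


def top_k_pairs(n, sim, k):
    """Use every image as a query and keep its k most similar images."""
    pairs = set()
    for q in range(n):
        others = [j for j in range(n) if j != q]
        ranked = sorted(others, key=lambda j: sim(q, j), reverse=True)
        # Assumption: pairs are unordered, so (q, j) and (j, q) coincide.
        pairs.update(frozenset((q, j)) for j in ranked[:k])
    return pairs


def candidate_pairs(docs, k):
    """Intersect the top-k pairs from the vector and the set similarity."""
    vecs = tfidf_vectors(docs)
    n = len(docs)
    by_vector = top_k_pairs(n, lambda i, j: cosine(vecs[i], vecs[j]), k)
    by_set = top_k_pairs(n, lambda i, j: simpson(docs[i], docs[j]), k)
    return by_vector & by_set
```

Calling `candidate_pairs(docs, k)` on a list of visual-word lists returns the set of image pairs that both rankings agree on; only these pairs would then be passed to the expensive feature-matching stage. Replacing `simpson` with another set similarity, such as Jaccard, changes only the second ranking.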