Measuring semantic similarity between concepts in the visual domain has a wide range of applications, such as natural language processing and multimedia retrieval. In general, it requires both a large pool of sample images for each concept and a model that captures the concept's visual characteristics. Because high-quality sample data is very difficult to obtain in large quantities, this paper proposes a novel method that improves concept similarity measurement by incorporating a concept modeling technique into the data pruning process. First, a number of sampling concept models are obtained for each concept, each built from a subset sampled from that concept's sample dataset. Then, noisy samples are discarded according to their probabilities under the sampling concept models. Experimental results on 31,275 Web images of 38 concepts defined in LSCOM indicate that the concept similarity obtained through the proposed approach is more consistent with human cognition. A concept hierarchy tree built from the 38 concepts and their similarities further demonstrates the effectiveness of the proposed method.
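
The two-step pruning idea (fit several models on random subsets of a concept's samples, then discard the samples those models find least probable) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the diagonal Gaussian concept model, the averaging of log-likelihoods across models, and the names `fit_diag_gaussian`, `log_likelihood`, `prune_noisy_samples`, `n_models`, `subset_frac`, and `keep_frac` are all hypothetical choices made here for illustration.

```python
import math
import random


def fit_diag_gaussian(samples):
    """Fit a per-dimension mean and variance to a list of feature vectors.

    Assumption: a diagonal Gaussian stands in for the concept model.
    """
    n, dim = len(samples), len(samples[0])
    mean = [sum(x[d] for x in samples) / n for d in range(dim)]
    # A small constant keeps the variance strictly positive.
    var = [sum((x[d] - mean[d]) ** 2 for x in samples) / n + 1e-6
           for d in range(dim)]
    return mean, var


def log_likelihood(x, mean, var):
    """Log-density of one feature vector under a diagonal Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def prune_noisy_samples(samples, n_models=10, subset_frac=0.5,
                        keep_frac=0.8, seed=0):
    """Return the sorted indices of the samples kept after pruning.

    Assumption: a sample's score is its log-likelihood averaged over all
    sampling models, and the lowest-scoring (1 - keep_frac) share is dropped.
    """
    rng = random.Random(seed)
    n = len(samples)
    subset_size = min(n, max(2, int(subset_frac * n)))
    scores = [0.0] * n
    for _ in range(n_models):
        # Step 1: build one sampling concept model from a random subset.
        subset = rng.sample(samples, subset_size)
        mean, var = fit_diag_gaussian(subset)
        # Step 2: score every sample of the concept under that model.
        for i, x in enumerate(samples):
            scores[i] += log_likelihood(x, mean, var) / n_models
    n_keep = max(1, int(keep_frac * n))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In this sketch each concept would be pruned independently, and the surviving samples would then be used to build the final concept model from which similarities are computed. The feature representation of each image is left open, since the abstract does not specify it.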