Timely, accurate, and complete transit data are a prerequisite for improving the service level of a public transportation query system. Collecting transit stop data with GPS terminals generates a large amount of redundant data, both because stops that share a name can lie at different locations and because the GPS system itself introduces positioning deviation. An improved K-means clustering algorithm is therefore proposed and applied to the cluster analysis of transit data that have the same stop name but different locations. Experimental results show that the algorithm is effective and that the clustering results agree with the actual situation.
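
To make the clustering task concrete, the following is a minimal sketch of plain (Lloyd's) K-means applied to GPS fixes recorded under a single stop (site) name. It is not the improved algorithm proposed here, whose modifications are not described in this abstract. The function name `kmeans`, the farthest-point initialisation, the Euclidean distance in degree space, and the sample coordinates are all illustrative assumptions.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's K-means on (lat, lon) pairs (standard algorithm, not the improved variant)."""
    # Assumed initialisation: start from the first point, then repeatedly add
    # the point farthest from the centroids chosen so far (deterministic).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each fix joins its nearest centroid.
        # Euclidean distance on raw degrees is a simplification that is only
        # reasonable over short distances.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]

        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                lat = sum(m[0] for m in members) / len(members)
                lon = sum(m[1] for m in members) / len(members)
                new_centroids.append((lat, lon))
            else:
                # Keep the old centroid if a cluster lost all its members.
                new_centroids.append(centroids[c])

        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids

    return centroids, labels


# Hypothetical GPS fixes logged under one stop name: two physical
# platforms, each observed three times with small GPS deviation.
fixes = [
    (39.9001, 116.4001), (39.9002, 116.4000), (39.9000, 116.4002),
    (39.9051, 116.4050), (39.9050, 116.4052), (39.9052, 116.4051),
]
centroids, labels = kmeans(fixes, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this sketch each resulting centroid stands in for one physical stop location, so the six redundant records collapse to two. Note that plain K-means needs the number of clusters `k` to be supplied in advance.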