Recently, generic object recognition (automatic image annotation), which aims to achieve human-like vision using a computer, has attracted attention for use in robot vision, automatic categorization of images, and image retrieval. For annotation, semi-supervised learning, which incorporates a large amount of unsupervised training data (unlabeled data) along with a small amount of supervised data (labeled data), is expected to be an effective tool because it reduces the burden of manual annotation. However, the unlabeled data used in semi-supervised models contains some outliers that negatively affect parameter estimation in the training stage. Such outliers often cause over-fitting, especially when only a small amount of training data is used. In this paper, we propose a practical method that prevents over-fitting in semi-supervised learning by suppressing existing outliers through sparse representation. In our experiments, we obtained an improvement of 4 points over the conventional semi-supervised methods SemiNB and TSVM.