Most semi-supervised learning methods assume that a number of labeled examples are available to learn an initial classifier, which then exploits a large set of unlabeled data. In some applications, however, only extremely sparse labeled examples are attainable (say, one example per category), and in this case these semi-supervised learning methods cannot work. This paper proposes a new method that uses the limited labeled data to seek additional examples whose labels are highly reliable. By investigating the correlation between different views through canonical correlation analysis, the method can launch semi-supervised learning using only one labeled example from each class. Experiments on text classification show the effectiveness of the proposed method.
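To make the idea concrete, the following Python sketch shows one way correlation between two views might be used to find unlabeled examples that resemble a single labeled one. It is an illustration under stated assumptions, not the exact procedure of this paper: NumPy is assumed to be available, the function names `cca` and `rank_by_similarity`, the ridge term `reg`, the similarity score, and the synthetic two-view data are all hypothetical choices made for the example.

```python
import numpy as np


def cca(X, Y, k, reg=1e-3):
    """Top-k canonical directions for two views X (n x dx) and Y (n x dy)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Regularized covariance estimates (reg is an assumed ridge term).
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor: M = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt[:k].T)
    return Wx, Wy, s[:k]


def rank_by_similarity(X, Y, labeled_idx, k=1):
    """Rank all examples by similarity to one labeled example.

    Assumed score: per-dimension Gaussian similarity in the projected
    space of each view, weighted by the canonical correlations.
    """
    Wx, Wy, corr = cca(X, Y, k)
    Px = (X - X.mean(axis=0)) @ Wx
    Py = (Y - Y.mean(axis=0)) @ Wy
    dx = (Px - Px[labeled_idx]) ** 2
    dy = (Py - Py[labeled_idx]) ** 2
    sim = 0.5 * (np.exp(-dx) + np.exp(-dy)) @ corr
    order = np.argsort(-sim)  # most similar first
    return order, sim


# Synthetic two-view data: both views are noisy functions of the class.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
z = np.where(labels == 0, -1.0, 1.0)
X = np.outer(z, [1.0, 0.5, -0.5]) + 0.1 * rng.standard_normal((40, 3))
Y = np.outer(z, [0.8, -1.2]) + 0.1 * rng.standard_normal((40, 2))

# Example 0 plays the role of the single labeled example of class 0.
order, sim = rank_by_similarity(X, Y, labeled_idx=0, k=1)
```

In a pipeline of this kind, the top-ranked unlabeled examples could be assigned the label of the seed example, and the enlarged labeled set could then be passed to a standard semi-supervised learner.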
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 by the strategic scientific research and experimental development program: SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.