Recently there has been significant interest in supervised learning algorithms that combine labeled and unlabeled data for text learning tasks. The co-training setting [1] applies to datasets that have a natural separation of their features into two disjoint sets. We demonstrate that when learning from labeled and unlabeled data, algorithms explicitly leveraging a natural independent split of the features outperform algorithms that do not. When a natural split does not exist, co-training algorithms that manufacture a feature split may outperform algorithms not using a split. These results help explain why co-training algorithms are both discriminative in nature and robust to the assumptions of their embedded classifiers.
Nigam et al. studied this question.