Abstract
In many pattern recognition tasks, researchers rely on labeled samples and ignore the information carried by unlabeled samples. In practice, however, labeled samples can be costly to obtain in labor, materials, and money, whereas unlabeled data are much easier to collect. Using unlabeled data to improve classifier performance has therefore become an active research topic in pattern recognition in recent years. Existing semi-supervised boosting methods exploit unlabeled samples mainly through their similarity to labeled samples, and that similarity is usually measured with the Euclidean distance. The Euclidean distance reflects only the spatial positions of the samples and ignores manifold information. This paper therefore proposes a semi-supervised boosting algorithm based on the geodesic distance, which reflects the manifold information of the sample space. Experimental results on several public data sets demonstrate the effectiveness of the proposed algorithm.
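The contrast between Euclidean and geodesic distance can be illustrated with a small sketch. The abstract does not say how the paper computes geodesic distances or turns them into similarities, so everything here is an assumption: the geodesic distance is approximated in the Isomap style (shortest paths over a symmetric k-nearest-neighbor graph), the similarity uses a Gaussian kernel, and the names `knn_graph`, `geodesic_distances`, `similarity`, `k`, and `sigma` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge weights (assumed construction)."""
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        ranked = sorted((euclidean(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in ranked[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(points, k):
    """Approximate geodesic distances as shortest-path lengths (Dijkstra from every node)."""
    graph = knn_graph(points, k)
    n = len(points)
    all_dist = []
    for source in range(n):
        dist = [math.inf] * n  # unreachable nodes stay at infinity
        dist[source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale queue entry
            for v, w in graph[u].items():
                nd = d + w
                if nd < dist[v]:
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        all_dist.append(dist)
    return all_dist


def similarity(distance, sigma=1.0):
    """Gaussian similarity; the kernel and sigma are illustrative assumptions."""
    return math.exp(-(distance ** 2) / (sigma ** 2))


# Ten points on three quarters of a unit circle: the two ends are fairly
# close in a straight line but far apart when travelling along the curve.
points = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in range(0, 271, 30)]
geo = geodesic_distances(points, k=2)
straight = euclidean(points[0], points[-1])
along_arc = geo[0][-1]
print("Euclidean:", straight, "similarity:", similarity(straight))
print("Geodesic: ", along_arc, "similarity:", similarity(along_arc))
```

On this toy arc the two end points receive a much lower similarity under the graph-based distance than under the Euclidean one, which is the kind of manifold information the abstract says the Euclidean measure misses. In a semi-supervised boosting setting, such a similarity matrix would stand in for the Euclidean one when relating unlabeled samples to labeled ones.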
Source
《计算机工程与应用》
CSCD
PKU Core (Peking University Core Journals list)
2011, Issue 21, pp. 202-204, 209 (4 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 60975083)
Keywords
geodesic distance
semi-supervised learning
manifold
boosting