Abstract
A weakly supervised classification method based on a density-center graph is proposed. It learns from a small number of labeled samples together with a large number of unlabeled samples. Using density information in the sample space, the method computes density center points that capture the spatial geometry of the data, builds a graph on those points, and applies label propagation so that similar vertices receive the same class label wherever possible. The method shares the sound mathematical foundation of graph-based weakly supervised algorithms, can discover classes of arbitrary shape, and is insensitive to noise. Its time complexity is near-linear, which makes it better suited to large-scale data. Experiments on UCI machine learning datasets show that the method achieves good classification performance.
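The pipeline described in the abstract (density centers, a graph over those centers, then label propagation) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the greedy center selection, the Gaussian edge weights, the clamped iterative propagation, and the names `density_centers`, `classify`, `radius`, `sigma`, and `seeds` are all assumptions made for illustration. The sketch also uses brute-force distance computations, so it does not reproduce the near-linear complexity claimed for the method.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def density_centers(points, radius):
    # Assumed stand-in for the density-center step: density is the number of
    # points within `radius`; visit points from densest to sparsest and keep
    # a point as a center if no already-kept center lies within `radius`.
    density = [sum(dist(p, q) <= radius for q in points) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(dist(points[i], points[c]) > radius for c in centers):
            centers.append(i)
    return centers

def classify(points, seeds, radius=0.5, sigma=1.0, iters=50):
    # seeds: {point index: class label} for the few labeled samples.
    centers = density_centers(points, radius)
    m = len(centers)
    # Each point is represented by its nearest center.
    owner = [min(range(m), key=lambda k: dist(p, points[centers[k]]))
             for p in points]
    classes = sorted(set(seeds.values()))
    # A center that owns a labeled point is clamped to that label
    # (simplification: a later seed overrides an earlier one).
    clamped = {owner[i]: y for i, y in seeds.items()}
    # Fully connected graph over centers with Gaussian edge weights.
    w = [[0.0 if a == b else
          math.exp(-dist(points[centers[a]], points[centers[b]]) ** 2
                   / (2 * sigma ** 2))
          for b in range(m)] for a in range(m)]
    # One score row per center, one column per class.
    f = [[1.0 if clamped.get(k) == c else 0.0 for c in classes]
         for k in range(m)]
    for _ in range(iters):
        nxt = []
        for a in range(m):
            if a in clamped:
                nxt.append(f[a])  # labeled centers keep their scores
                continue
            total = sum(w[a]) or 1.0
            nxt.append([sum(w[a][b] * f[b][j] for b in range(m)) / total
                        for j in range(len(classes))])
        f = nxt
    center_class = [classes[max(range(len(classes)), key=lambda j: row[j])]
                    for row in f]
    # Every point inherits the class of its center.
    return [center_class[k] for k in owner]

# Two elongated groups, one labeled sample in each.
points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (1.2, 0.2), (1.3, 0.4),
          (5, 5), (5.2, 5.1), (5.1, 4.8), (6.1, 5.2), (6.3, 5.0)]
seeds = {0: "a", 5: "b"}
print(classify(points, seeds))
```

Propagating labels over the centers rather than over every sample keeps the graph small, which is the intuition behind the scalability claim. A faithful implementation would also need a sub-quadratic way to estimate density.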
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2015, Issue 6, pp. 6-10 (5 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 61373117)
Xi'an University of Posts and Telecommunications Youth Fund (No. 103-0457)
Keywords
weakly supervised learning
classification
density
data mining