Abstract
To improve text classification performance, this paper proposes a semi-supervised learning algorithm based on label propagation within a constrained range. The algorithm first derives a probability transition matrix from the similarity matrix and then uses the transition matrix to determine the constrained range. Within that range, it applies a label propagation algorithm under the semi-supervised learning framework to compute path-based similarity, which identifies the important paths for label propagation. Because only a few important propagation paths are used, the algorithm does not need to compute the similarity of every path, which greatly reduces computational complexity. Labels are then propagated between labeled and unlabeled data along these few important paths. Experiments demonstrate the effectiveness of the algorithm.
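The pipeline in the abstract (similarity matrix, then probability transition matrix, then propagation inside a restricted neighbourhood) can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define how the constrained range or the path-based similarity is computed. The sketch therefore assumes a Gaussian similarity, substitutes a simple top-`k` cut per row for the constrained range, and uses plain clamped label propagation in place of path-based similarity. The names `rbf_similarity`, `transition_matrix`, `propagate`, `sigma`, and `k` are all hypothetical.

```python
import math

def rbf_similarity(points, sigma=1.0):
    """Gaussian similarity matrix with a zero diagonal (assumed kernel)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W

def transition_matrix(W, k):
    """Row-normalise W after keeping only the k largest entries per row.

    The top-k cut is an assumption standing in for the paper's
    constrained range, which the abstract does not specify.
    """
    P = []
    for row in W:
        keep = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        total = sum(row[j] for j in keep)
        if total == 0.0:
            P.append([0.0] * len(row))  # isolated point: no outgoing mass
        else:
            P.append([row[j] / total if j in keep else 0.0
                      for j in range(len(row))])
    return P

def propagate(P, labels, n_classes, iters=100):
    """Clamped label propagation: F <- P F, then reset the labelled rows."""
    n = len(P)

    def one_hot(y):
        return [1.0 if c == y else 0.0 for c in range(n_classes)]

    F = [one_hot(y) if y is not None else [0.0] * n_classes for y in labels]
    for _ in range(iters):
        F_new = [[sum(P[i][j] * F[j][c] for j in range(n))
                  for c in range(n_classes)] for i in range(n)]
        for i, y in enumerate(labels):
            if y is not None:
                F_new[i] = one_hot(y)  # keep known labels fixed
        F = F_new
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Toy data: two well-separated clusters, one labelled point in each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (5.2, 5.1)]
labels = [0, None, None, 1, None, None]
predicted = propagate(transition_matrix(rbf_similarity(points), k=2), labels, 2)
print(predicted)
```

Restricting each row to `k` neighbours keeps the transition matrix sparse, which is the same motivation the abstract gives for propagating along only a few important paths.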
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2016, No. 8, pp. 2303-2306 (4 pages)
Application Research of Computers
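Funding
National Natural Science Foundation of China (61363058, 61163039)
Gansu Province Youth Science and Technology Fund (145RJYA259)
Natural Science Foundation of Gansu Province (145RJZA232)
Open Fund of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (IIP2014-4)
Northwest Normal University 2013 Young Teachers' Research Capability Improvement Program (NWNU-LKQN-12-23)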
Keywords
probability transition matrix
constrained region
label propagation
semi-supervised learning algorithm