A graph-based semi-supervised learning algorithm for Web page classification (original title: 一种网页分类中基于图的半指导学习算法)

Study of Web page classification based on graph-based semi-supervised learning
Abstract This paper proposes a graph-based semi-supervised learning algorithm for Web page classification. A k-nearest-neighbor algorithm is used to construct a weighted graph whose nodes are labeled and unlabeled Web pages and whose edge weights, derived from the similarity between nodes, represent the probability of propagating a class label. Web page classification is thereby formulated as probabilistic label propagation on the graph. To make effective use of the unlabeled nodes, the edge weights are computed by combining the content information of the pages (term weighting schemes) with their link information. Using probabilistic matrix methods and belief propagation, class labels are then pushed with a certain probability from the labeled nodes to the unlabeled nodes. Experiments on the WebKB dataset indicate that the proposed algorithm improves Web page classification results.
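Authors 刘蓉 (Liu Rong), 周建中 (Zhou Jianzhong)
Source Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 3, pp. 735-737 (3 pages)
Funding Supported by the National Natural Science Foundation of China (50579022, 50539140)
Keywords graph model; semi-supervised learning; Web page classification; link information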
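
The pipeline summarized in the abstract (k-nearest-neighbor graph, edge weights mixing content and link information, labels pushed from labeled to unlabeled nodes) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the mixing rule `alpha * cosine + (1 - alpha) * link`, the treatment of links as undirected, the clamped iterative update (a standard label-propagation scheme), and the six toy pages are all assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_knn_graph(features, links, k=2, alpha=0.7):
    """Build a symmetric kNN weight matrix over pages.

    Assumed edge weight: alpha * content similarity + (1 - alpha) * link flag.
    Links are treated as undirected (also an assumption).
    """
    n = len(features)
    linked = {frozenset(edge) for edge in links}
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                link = 1.0 if frozenset((i, j)) in linked else 0.0
                sim = cosine(features[i], features[j])
                full[i][j] = alpha * sim + (1 - alpha) * link
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Keep only the k strongest neighbours of node i.
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: full[i][j], reverse=True)[:k]
        for j in nearest:
            w[i][j] = w[j][i] = full[i][j]  # symmetrise the kNN relation
    return w

def propagate_labels(w, labeled, n_classes, iters=200):
    """Iteratively average neighbours' class scores; labeled nodes stay clamped."""
    n = len(w)
    f = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for i, c in labeled.items():
        f[i] = [1.0 if j == c else 0.0 for j in range(n_classes)]
    for _ in range(iters):
        new_f = []
        for i in range(n):
            total = sum(w[i])
            if i in labeled or total == 0:
                new_f.append(f[i])  # clamped label, or isolated node
            else:
                new_f.append([
                    sum(w[i][j] * f[j][c] for j in range(n)) / total
                    for c in range(n_classes)
                ])
        f = new_f
    return f

# Hypothetical toy data: six pages, three term weights each.
features = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.8, 0.0, 0.1],
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.2, 0.8],
]
links = [(0, 1), (1, 2), (3, 4), (4, 5)]  # hyperlinks between pages
labeled = {0: 0, 5: 1}                    # page index -> known class

weights = build_knn_graph(features, links, k=2, alpha=0.7)
scores = propagate_labels(weights, labeled, n_classes=2)
predicted = [row.index(max(row)) for row in scores]
print(predicted)
```

Each row of `scores` is a class distribution for one page, and `predicted` takes the most probable class. Lowering `alpha` makes hyperlinks count for more relative to page content, which is the trade-off the abstract alludes to when it combines the two sources of information.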