
Semi-supervised learning by constructing query-document heterogeneous information network (cited by: 2)
Abstract: Graph-based semi-supervised learning has been studied extensively in recent years, yet most existing algorithms apply only to homogeneous networks; classification on heterogeneous networks has been explored only recently. This paper considers semi-supervised classification on a query-document heterogeneous information network, which is built from the click-through bipartite graph together with the content features of both queries and documents. Class information from labeled sample nodes is introduced to strengthen the network structure. A regularization framework and an iterative algorithm are proposed for this network. In the regularization framework, a cost function based on the manifold assumption accounts for both the direct relationship between the two entity sets and the content information on each side, and its closed-form solution is used to predict the class labels of unlabeled queries and documents. Experiments on the query log of a large-scale commercial search engine show that the proposed method outperforms traditional semi-supervised learning methods.
Source: Journal on Communications (《通信学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2014, No. 8, pp. 40-47 (8 pages)
Funding: National Natural Science Foundation of China (61173036)
Keywords: heterogeneous information networks; semi-supervised learning; information retrieval; click-through data
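
The abstract describes predicting labels for unlabeled queries and documents by minimizing a graph-regularized cost, either in closed form or iteratively. The record does not give the paper's actual cost function, so the sketch below is not the authors' method. It shows the classic local-and-global-consistency propagation, F <- alpha * S F + Y, applied to a single merged graph whose nodes are queries and documents. The fixed point of that iteration is proportional to the closed form (I - alpha S)^{-1} Y. The function name `propagate_labels`, its parameters, and the toy click counts are all hypothetical.

```python
from math import sqrt


def propagate_labels(edges, n_nodes, seeds, n_classes, alpha=0.9, n_iter=200):
    """Iterative label propagation on an undirected weighted graph.

    edges: {(i, j): positive weight}; queries and documents share one index
           space, so clicks and content similarities are all plain edges.
    seeds: {node: class} for the labeled nodes.
    Returns the predicted class index for every node.
    """
    # Build symmetric weighted adjacency lists and node degrees.
    neighbors = [[] for _ in range(n_nodes)]
    degree = [0.0] * n_nodes
    for (i, j), w in edges.items():
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
        degree[i] += w
        degree[j] += w

    # Y holds the known labels as one-hot rows; unlabeled rows stay zero.
    Y = [[0.0] * n_classes for _ in range(n_nodes)]
    for node, cls in seeds.items():
        Y[node][cls] = 1.0

    # Repeat F <- alpha * S F + Y with S = D^{-1/2} W D^{-1/2}.
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        new_F = [row[:] for row in Y]
        for i in range(n_nodes):
            for j, w in neighbors[i]:
                s_ij = w / sqrt(degree[i] * degree[j])
                for c in range(n_classes):
                    new_F[i][c] += alpha * s_ij * F[j][c]
        F = new_F

    # Each node takes the class with the largest propagated score.
    return [max(range(n_classes), key=lambda c: row[c]) for row in F]


# Toy graph (assumed data): nodes 0-3 are queries, nodes 4-5 are documents,
# and edge weights are click counts. Only queries 0 and 2 are labeled.
clicks = {(0, 4): 3, (1, 4): 2, (2, 5): 4, (3, 5): 1}
print(propagate_labels(clicks, n_nodes=6, seeds={0: 0, 2: 1}, n_classes=2))
# prints [0, 0, 1, 1, 0, 1]
```

In this toy case the unlabeled query 1 and document 4 inherit class 0 from query 0 through shared clicks, and query 3 and document 5 inherit class 1 from query 2. A node with no path to any labeled node keeps an all-zero score row and defaults to class 0, which a real implementation would want to flag.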

