
Clustering Analysis in Heterogeneous Information Network Based on Matrix Factorization
Cited by: 4
Abstract: Real-world networked data often contain multiple types of nodes and links. Modeling such data as a heterogeneous information network captures the interacting objects and the relations among them more completely, which has made heterogeneous information network analysis an active topic in data mining. Although clustering on homogeneous networks has been studied intensively, clustering on heterogeneous networks has received little attention. The coexistence of multiple object types and the rich semantics of heterogeneous information networks pose new challenges for clustering analysis. This paper studies the clustering problem in heterogeneous information networks and proposes HeteClus, a clustering method based on matrix factorization. The method first uses HeteSim to compute an object similarity matrix along a user-specified semantic path, and then applies orthogonal non-negative matrix tri-factorization to obtain either a soft or a hard clustering of the nodes. Experiments on synthetic and real network data demonstrate the effectiveness of the method, and case studies illustrate the physical meaning of the factorization results.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal; 2014, No. 10, pp. 2256-2261 (6 pages)
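Funding: Supported by the National Natural Science Foundation of China (61375058, 61074128, 71231002, 60905025), the National Key Basic Research Program of China (2013CB329603), and the Fundamental Research Funds for the Central Universities.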
Keywords: heterogeneous information network; clustering analysis; nonnegative matrix factorization; matrix tri-factorization; semantic similarity search
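The two-step procedure summarized in the abstract (path-based similarity, then orthogonal non-negative tri-factorization) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a length-2 symmetric path (for example author-paper-author), for which HeteSim reduces to the cosine similarity between row-normalized transition vectors, and it uses the commonly cited multiplicative updates for bi-orthogonal tri-factorization X ≈ F S Gᵀ; the abstract does not state which update rules HeteClus uses. The function names `hetesim_length2` and `orthogonal_nmtf` and the toy matrix `W` are hypothetical.

```python
import numpy as np

def hetesim_length2(W):
    """Similarity along a symmetric length-2 path A-B-A (assumed simple case).

    W is the A x B adjacency matrix. Each row is turned into a transition
    distribution, and the similarity is the cosine between those rows.
    """
    row_sums = W.sum(axis=1, keepdims=True)
    U = W / np.maximum(row_sums, 1e-12)            # row-normalized transitions
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    Un = U / np.maximum(norms, 1e-12)              # unit-length rows
    return Un @ Un.T                               # pairwise cosine

def orthogonal_nmtf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for X ~ F S G^T with non-negative factors.

    The square-root updates encourage F^T F and G^T G to stay close to
    the identity; this is one standard variant, assumed for illustration.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)) + eps
    G = rng.random((m, k)) + eps
    S = rng.random((k, k)) + eps
    for _ in range(n_iter):
        G *= np.sqrt((X.T @ F @ S) / (G @ G.T @ X.T @ F @ S + eps))
        F *= np.sqrt((X @ G @ S.T) / (F @ F.T @ X @ G @ S.T + eps))
        S *= np.sqrt((F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps))
    return F, S, G

# Toy data (made up): authors 0-2 share papers 0-1, authors 3-5 share papers 2-3.
W = np.array([[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 3, dtype=float)

X = hetesim_length2(W)                 # 6 x 6 author-author similarity
F, S, G = orthogonal_nmtf(X, k=2)

# Soft clustering: normalized membership weights per object.
soft = F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
# Hard clustering: the cluster with the largest weight.
labels = F.argmax(axis=1)
```

Here the rows of `F` play the role of cluster-membership indicators, and `S` summarizes how strongly the clusters are associated with each other. Multiplicative updates only reach a local optimum, so results can depend on the random initialization.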

References (15)

  • [1] Von Luxburg Ulrike. A tutorial on spectral clustering [J]. Statistics and Computing, 2007, 17(4): 395-416.
  • [2] Yang Tian-bao, Jin Rong, Chi Yun, et al. Combining link and content for community detection: a discriminative approach [C]. Proceedings of Knowledge Discovery and Data Mining (KDD), 2009: 927-935.
  • [3] Long Bo, Zhang Zhong-fei, Yu Philip S. Co-clustering by block value decomposition [C]. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2005: 635-640.
  • [4] Gao Bin, Liu Tie-yan, Zheng Xin, et al. Consistent bipartite graph co-partitioning for star structured high-order heterogeneous data co-clustering [C]. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2005: 41-50.
  • [5] Sun Yi-zhou, Yu Yin-tao, Han Jia-wei. Ranking-based clustering of heterogeneous information networks with star network schema [C]. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2009: 797-806.
  • [6] Sun Yi-zhou, Charu C Aggarwal, Han Jia-wei. Relation strength-aware clustering of heterogeneous information networks with incomplete attributes [C]. International Conference on Very Large Data Bases, 2012, 5(5): 394-405.
  • [7] Sun Yi-zhou, Norick B, Han Jia-wei, et al. Integrating meta-path selection with user-guided object clustering in heterogeneous information networks [C]. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2012: 1348-1356.
  • [8] Wang Ran, Shi Chuan, Yu Philip S, et al. Integrating clustering and ranking on hybrid heterogeneous information network [M]. Advances in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, 2013: 583-594.
  • [9] Sun Yi-zhou, Han Jia-wei, Yan Xi-feng, et al. PathSim: meta path-based top-k similarity search in heterogeneous information networks [C]. International Conference on Very Large Data Bases, 2011.
  • [10] Shi Chuan, Kong Xiang-nan, Yu Philip S. Relevance search in heterogeneous networks [C]. International Conference on Extending Database Technology, 2012: 26-30.

Co-cited documents: 11

Citing documents: 4

Second-level citing documents: 9
