
Heterogeneous Information Network Representation Learning: A Survey

Cited by: 6
Abstract Information networks, such as social networks, academic networks, and the World Wide Web, are ubiquitous in the real world. With the rapid development of information technology, methods for analyzing information networks face major challenges from ever-expanding network scale and data sparsity: analysis methods must scale well and must cope effectively with sparse data. As an effective way to deal with these challenges, information network representation learning aims to embed nodes or links into a low-dimensional vector space by using information such as network topology and node content, while preserving the intrinsic structural and content features of the original network, so that network analysis tasks such as node classification, clustering, and link prediction can be carried out on low-dimensional dense vectors. A heterogeneous information network, which consists of multiple types of nodes and links, contains richer and more comprehensive structural and semantic information. Representation learning for heterogeneous information networks can therefore not only alleviate the high dimensionality and sparsity of network data, but also integrate different types of heterogeneous information into the same vector space, which makes the learned representations more meaningful and valuable. In recent years, representation learning for heterogeneous information networks and its applications have drawn increasing attention in both academia and industry and have become an important research topic in network analysis, with many methods proposed. However, the lack of a detailed and comprehensive survey of existing work makes it difficult for researchers to systematically follow the latest progress and for practitioners to choose appropriate embedding models and algorithms. To this end, this survey presents a thorough review of existing work on representation learning for heterogeneous information networks. We first introduce basic concepts of heterogeneous information networks and network embedding learning; divide heterogeneous information networks into nine categories based on their sources of heterogeneity (structural, attribute, multi-layer, multi-view, multiplex, attributed multiplex, multi-resolution, heterogeneous feature, and dynamic heterogeneous networks); summarize the characteristics of each category; and introduce the principles of random walks and negative sampling, two techniques used by many representation learning methods for heterogeneous information networks. Next, we comprehensively review network representation learning methods for single, multiple, and dynamic heterogeneous information networks, including random walk-based, decomposition-based, and deep neural network-based methods. We also analyze the characteristics of the different methods, such as the information and techniques used, the embedded objects, the representation form, and the time complexity. Then, we summarize the data sets and evaluation metrics commonly used by embedding learning methods for heterogeneous information networks. In addition, several typical applications of embedding learning for heterogeneous information networks, namely author identification, recommendation, and sentiment link prediction, are presented in detail. Finally, we outline future research directions for embedding learning on heterogeneous information networks. This survey helps researchers gain a better understanding of existing work and helps practitioners solve practical application problems more effectively.
Authors ZHOU Li-Hua (周丽华); WANG Jia-Long (王家龙); WANG Li-Zhen (王丽珍); CHEN Hong-Mei (陈红梅); KONG Bing (孔兵) (School of Information Science and Engineering, Yunnan University, Kunming 650500)
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2022, No. 1, pp. 160-189 (30 pages)
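Funding Supported by the National Natural Science Foundation of China (62062066, 61762090, and 61966036), the Yunnan Province University Innovation Program (IRTSTYN), the National Social Science Foundation of China (18XZZ005), the Yunnan Provincial University Key Laboratory of Internet of Things Technology and Application, and the Yunnan University Postgraduate Research and Innovation Fund (2021Y024).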
Keywords heterogeneous information network; representation learning; random walks; negative sampling; deep neural network