Abstract
Community detection is one of the key problems in data mining. It helps extract the information and knowledge implicit in data and provides strong support for link prediction, key node identification, and personalized recommendation. Multi-source heterogeneous data generally consists of multiple types of entity objects and relationships and is characterized by heterogeneous formats and diverse semantics; heterogeneous information networks can better mine the semantic information in such data. To make full use of the rich semantic information in heterogeneous information networks, a community discovery algorithm that fuses multiple meta-paths is proposed. The algorithm jointly considers the semantic information under multiple different meta-paths, combines those meta-paths to measure the similarity between any pair of nodes, calculates node importance, selects seed nodes and expands them to obtain an initial community partition, and then refines that partition by label expansion based on the seed communities to produce the final community division. Simulation experiments were carried out on real datasets, and the results show that the algorithm achieves good community discovery results on heterogeneous information networks.
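The similarity-fusion and seed-selection steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the meta-path weights, or the importance formula, so everything concrete here is an assumption: a PathSim-style measure stands in for the per-meta-path similarity, the weights are equal, importance is taken as a node's total fused similarity to all other nodes, and the author/paper/venue toy data and all function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*m)]


def pathsim(commuting):
    """PathSim-style similarity from a symmetric commuting matrix (assumed measure)."""
    n = len(commuting)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = commuting[i][i] + commuting[j][j]
            sim[i][j] = 2.0 * commuting[i][j] / denom if denom else 0.0
    return sim


def fuse(sims, weights):
    """Weighted sum of per-meta-path similarity matrices (assumed fusion rule)."""
    n = len(sims[0])
    return [[sum(w * s[i][j] for s, w in zip(sims, weights)) for j in range(n)]
            for i in range(n)]


# Hypothetical toy network: 3 authors, 3 papers, 2 venues.
author_paper = [[1, 1, 0],
                [0, 1, 0],
                [0, 0, 1]]
paper_venue = [[1, 0],
               [1, 0],
               [0, 1]]

# Commuting matrices for two symmetric meta-paths between authors.
apa = matmul(author_paper, transpose(author_paper))      # author-paper-author
author_venue = matmul(author_paper, paper_venue)
apvpa = matmul(author_venue, transpose(author_venue))    # author-paper-venue-paper-author

# Fuse the two meta-path similarities with equal (assumed) weights.
fused = fuse([pathsim(apa), pathsim(apvpa)], weights=[0.5, 0.5])

# Assumed importance: total fused similarity to every other node.
importance = [sum(row) - row[i] for i, row in enumerate(fused)]
seed = max(range(len(importance)), key=importance.__getitem__)
```

In this toy network the third author shares no paper or venue with the others, so its fused similarity to them is zero and it would not be chosen as a seed under this assumed importance score. The expansion and label-optimization stages are not sketched because the abstract gives no detail on them.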
Authors
卢兴文
段同乐
李祥民
LU Xingwen; DUAN Tongle; LI Xiangmin (th Research Institute of China Electronic Technology Group Corporation, Shijiazhuang 050081, China)
Source
《现代电子技术》
2023, No. 7, pp. 79-84 (6 pages)
Modern Electronics Technique
Keywords
community discovery algorithm
applied research
heterogeneous information network
data mining
meta-path fusion
semantic information
community division