Abstract
To study the influence of community structure on network representation learning, a novel network representation learning algorithm that integrates community structure information (CINE) is proposed. Drawing on the idea of modularity, the community structure is incorporated into a matrix-factorization-based model so that the community structure within the network is preserved. An overall objective function is designed that captures community structure information while also integrating the first-order and second-order proximity between nodes and the attribute information of the nodes, yielding node representations that contain all three types of information from the original network. Three public network datasets (Cora, Citeseer and Wiki) are used to evaluate CINE on node classification, link prediction and visualization tasks. The results show that, in the classification task on the three datasets, CINE's Micro-F1 scores reach 0.9002, 0.8402 and 0.7619, respectively, better than all comparison algorithms; in the link prediction task on the Cora dataset, CINE's AUROC score is higher than that of Node2vec, DeepWalk and TADW by factors of 1.165, 1.144 and 1.059, respectively. This indicates that CINE captures community structure information while retaining the structural and attribute information of the network, so that the learned node representations perform better in subsequent network analysis tasks.
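As a rough illustration of the modularity idea the abstract refers to, the sketch below computes Newman's standard modularity matrix B = A - k k^T / (2m) and the modularity score Q of a community assignment. It is not the CINE objective function, which the abstract does not spell out; the function names and the toy graph are assumptions made for illustration only.

```python
def modularity_matrix(adj):
    """Newman's modularity matrix B = A - k k^T / (2m) for an undirected graph."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # sum of degrees equals 2m in an undirected graph
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


def modularity(adj, communities):
    """Modularity Q = (1 / 2m) * sum of B_ij over node pairs in the same community."""
    b = modularity_matrix(adj)
    two_m = sum(sum(row) for row in adj)
    n = len(adj)
    total = sum(b[i][j] for i in range(n) for j in range(n)
                if communities[i] == communities[j])
    return total / two_m


# Hypothetical toy graph: two triangles (nodes 0-2 and 3-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_good = modularity(adj, [0, 0, 0, 1, 1, 1])  # split along the bridging edge
q_bad = modularity(adj, [0, 1, 0, 1, 0, 1])   # arbitrary split across triangles
```

On this toy graph the split along the bridging edge scores higher than the arbitrary split, which is the kind of signal a modularity term can contribute to an embedding objective.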
Authors
LIU Yan-bei (刘彦北)
LIU Jin-xin (刘金新)
GENG Lei (耿磊)
WANG Wen (王雯)
School of Life Sciences, Tiangong University, Tianjin 300387, China; Tianjin Key Laboratory of Optoelectronic Detection Technology, Tiangong University, Tianjin 300387, China; School of Electronics and Information Engineering, Tiangong University, Tianjin 300387, China
Source
Journal of Tiangong University (《天津工业大学学报》)
CAS
PKU Core Journal (北大核心)
2022, Issue 2, pp. 53-59 (7 pages)
Funding
Natural Science Foundation of Tianjin (21JCZXJC00170)
Scientific Research Program of the Tianjin Municipal Education Commission (2017KJ087)
Keywords
network representation learning
attribute information
community structure
node classification
link prediction
visualization