Abstract
Reasoning over graph-structured data is an important task, and its main challenge is how to represent graph-structured knowledge so that machines can quickly understand and use it. A comparison of existing representation learning models shows that models based on random walks tend to ignore the special role that attributes play in the associations between nodes. To address this, a hybrid random walk method based on node adjacency and attribute association is proposed. First, attribute weights are calculated from the distribution of attributes shared by adjacent nodes, and the sampling probability from a node to each of its attributes is obtained. Then, network information is extracted separately from adjacent nodes and from non-adjacent nodes that share attributes. Finally, a network representation learning model based on the node-attribute bipartite graph is constructed, and node vector representations are learned from the sampled sequences. On the public Flickr, BlogCatalog and Cora datasets, node classification with the node vector representations obtained by the proposed model reaches an average Micro-F1 of 89.38%, which is 2.02 percentage points higher than GraphRNA (Graph Recurrent Networks with Attributed random walks) and 21.12 percentage points higher than the classical DeepWalk. In addition, a comparison of different random walk methods shows that raising the sampling probability of attributes that promote node association increases the information contained in the sampled sequences.
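The sampling steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the weight used here (one plus the number of edges whose endpoints share the attribute), the mixing parameter `alpha`, and all function names are assumptions made for illustration only. Every node is assumed to have an entry in `attrs`.

```python
import random
from collections import Counter, defaultdict


def attribute_weights(edges, attrs):
    """Assumed weighting: 1 + number of edges whose two endpoints share the attribute."""
    shared = Counter()
    for u, v in edges:
        shared.update(attrs[u] & attrs[v])
    all_attrs = set().union(*attrs.values())
    return {a: 1 + shared[a] for a in all_attrs}


def attribute_probs(node, attrs, weights):
    """Normalise the weights of one node's attributes into sampling probabilities."""
    total = sum(weights[a] for a in attrs[node])
    return {a: weights[a] / total for a in attrs[node]}


def hybrid_walk(start, edges, attrs, length=10, alpha=0.5, rng=None):
    """Mix adjacency steps (probability alpha) with attribute-mediated steps."""
    rng = rng or random.Random(0)
    neighbours = defaultdict(list)
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    # Node-attribute bipartite graph, stored as attribute -> nodes carrying it.
    holders = defaultdict(list)
    for node, node_attrs in attrs.items():
        for a in node_attrs:
            holders[a].append(node)
    weights = attribute_weights(edges, attrs)

    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        use_edge = neighbours[cur] and (not attrs[cur] or rng.random() < alpha)
        if use_edge:
            # Adjacency step: move to a uniformly chosen neighbour.
            walk.append(rng.choice(neighbours[cur]))
        elif attrs[cur]:
            # Attribute step: pick an attribute by weight, then any node carrying it
            # (possibly the current node itself, or a non-adjacent node).
            probs = attribute_probs(cur, attrs, weights)
            names = sorted(probs)
            chosen = rng.choices(names, weights=[probs[a] for a in names])[0]
            walk.append(rng.choice(holders[chosen]))
        else:
            break  # isolated node without attributes: nowhere to go
    return walk


toy_edges = [("a", "b"), ("b", "c")]
toy_attrs = {"a": {"x", "y"}, "b": {"x"}, "c": {"y"}, "d": {"y"}}
print(hybrid_walk("a", toy_edges, toy_attrs))
```

In this toy graph, node `d` has no edges and can only be reached through the shared attribute `y`, which is the kind of non-adjacent association the attribute step is meant to capture. The resulting walks would then be fed to a sequence-based embedding model, which is outside the scope of this sketch.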
Authors
周乐
代婷婷
李淳
谢军
楚博策
李峰
张君毅
刘峤
ZHOU Le; DAI Tingting; LI Chun; XIE Jun; CHU Boce; LI Feng; ZHANG Junyi; LIU Qiao (School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China; Hebei Key Laboratory of Electromagnetic Spectrum Cognition and Control, Shijiazhuang, Hebei 050081, China; Key Laboratory of Aerospace Information Applications, China Electronics Technology Group Corporation, Shijiazhuang, Hebei 050081, China)
Source
《计算机应用》
CSCD
PKU Core (Peking University Core Journals)
2022, No. 8, pp. 2311-2318 (8 pages)
Journal of Computer Applications
Funding
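National Natural Science Foundation of China (U19B2028, 61772117)
Open Project of the 54th Research Institute of China Electronics Technology Group Corporation (191055, 201148, 190900, 200662)
Fundamental Research Funds for the Central Universities (ZYGX2019J077)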
Keywords
network embedding
representation learning
random walk
network sampling
attributed network
node classification