摘要
图节点的低维嵌入在各种预测任务中是非常有用的,如蛋白质功能预测、内容推荐等。然而,多数方法不能自然推广到不可见节点。图采样聚合算法(Graph Sample and Aggregate,Graphsage)虽然可以提高不可见节点生成嵌入的速度,但容易引入噪声数据,且生成的节点嵌入的表示能力不高。为此,文中提出了一种基于KNN与矩阵变换的图节点嵌入归纳式学习算法。首先,通过KNN选取K个邻节点;然后,根据聚合函数生成聚合信息;最后,利用矩阵变换与全连接层对聚合信息和节点信息进行计算,得到新的节点嵌入。为了有效权衡计算时间与性能,文中提出一种新的聚合函数,对邻节点特征运用最大池化作为聚合信息输出,以更多地保留邻节点信息,降低计算代价。在reddit和PPI两个数据集上的实验表明,所提算法在micro-f1和macro-f1两个评价指标上分别获得了4.995%与10.515%的提升。因此,该算法可以大幅减少噪声数据,提高节点嵌入的表示能力,快速有效地为不可见节点及不可见图生成节点嵌入。
Low-dimensional embedding of graph nodes is very useful in various prediction tasks,such as protein function prediction,content recommendation and so on.However,most methods cannot be naturally extended to invisible nodes.Graph Sample and Aggregate(Graph Sample and Aggregate,Grasage)algorithm can improve the speed of invisible node generation embedding,but it is easy to introduce noise data,and the representation ability of generated node embedding is not high.In this paper,an inductive learning algorithm based on KNN and matrix transformation for graph node embedding is proposed.Firstly,K neighbo-ring nodes are selected by KNN.Then aggregation information is generated by aggregation function.Finally,aggregation information and node information are calculated by matrix transformation and full connection layer,and new node embedding is obtained.In order to balance computing time and performance effectively,this paper proposes a new aggregation function,which uses maximum pooling as aggregation information output for neighbor node features,retains more neighbor node information and reduces computing cost.Experiments on two data sets of reddit and PPI show that the proposed algorithm achieves 4.995%and 10.515%improvement on micro-f1 and macro-f1,respectively.The experimental data fully show that the algorithm can greatly reduce noise data,improve the representation ability of node embedding,and quickly and effectively generate node embedding for invisible nodes and invisible graphs.
作者
贺苗苗
郭卫斌
HE Miao-miao;GUO Wei-bin(School of Information Science and Engineering,East China University of Science and Technology,Shanghai 200237,China)
出处
《计算机科学》
CSCD
北大核心
2021年第3期201-205,共5页
Computer Science
基金
国家自然科学基金(61672227)。
关键词
低维嵌入
KNN
节点嵌入
聚合函数
表示能力
Low dimensional embedding
KNN
Node embedding
Aggregation function
Representation ability