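Abstract
Link prediction in networks can recover important information about missing links or support analysis of a network's dynamic evolution. Existing link prediction methods based on node similarity typically target simple first-order (or multi-order) neighbor information or specific types of small networks and rely on fairly complex computations, so both their scalability and their computability on large-scale networks are severely challenged. Inspired by the application of deep learning to neural network language models, this paper proposes the LsNet2Vec (Large-scale Network to Vector) model. By combining a random-walk-based method for serializing the network dataset with large-scale unsupervised machine learning, the model maps the structural feature information of each node to a continuous, fixed-dimensional real-valued vector. The learned node structural feature vectors can then be used to quickly compute the similarity between any pair of nodes in a large-scale network, and thereby to predict links. Comparison against several state-of-the-art baseline prediction algorithms on 16 large-scale real-world datasets shows that LsNet2Vec achieves the best overall prediction performance: while keeping link prediction computationally feasible on large-scale networks, it yields sizable AUC improvements over existing methods on multiple datasets, up to 8.9%.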
The problem of link prediction falls into two classes: missing-link prediction and future-link prediction. The former predicts unknown links in sampled networks; the latter predicts links that may appear in the future of evolving complex networks. To date, most link prediction methods have been designed on the assumption of node similarity, which is defined using the essential features of nodes. Evaluating the similarity of two nodes makes the sparsity and the huge size of networks two of the main remaining challenges in link prediction. In this work, we present a new model, named LsNet2Vec, for link prediction in large-scale networks based on unsupervised machine learning. The main idea of our method is to embed the features of nodes in large-scale networks into real-valued vectors of a lower, fixed dimension. We conduct extensive experimental analysis on sixteen well-known datasets and present a controlled comparison of the LsNet2Vec model against several strong link prediction baselines, such as the Katz index and the random walk with restart method, using AUC testing. Results show that our model performs comparably with state-of-the-art methods in various experiment settings.
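The abstract names three computational pieces: serializing the network with random walks, scoring node pairs by the similarity of learned vectors, and evaluating with AUC. The following is a minimal Python sketch of those pieces, not the authors' implementation. The function names, the walk parameters, the uniform walk policy, and the choice of cosine similarity are all assumptions made for illustration; the abstract does not specify them. The embedding training step itself (learning vectors from the walks with a neural language model) is deliberately left out.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency sets from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def random_walks(adj, walks_per_node=10, walk_length=8, seed=0):
    """Serialize the graph into node sequences using uniform random walks.

    Assumption: uniform transition probabilities and fixed-length walks;
    the paper's exact walk strategy is not given in the abstract.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adj[walk[-1]])
                if not neighbors:
                    break  # isolated node: stop the walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def auc(score, held_out_edges, non_edges, n_samples=1000, seed=0):
    """Sampling-based AUC for link prediction.

    Repeatedly compares the score of a random held-out (true) link with
    the score of a random non-existent link: a win counts 1, a tie 0.5.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        pos = score(*rng.choice(held_out_edges))
        neg = score(*rng.choice(non_edges))
        if pos > neg:
            total += 1.0
        elif pos == neg:
            total += 0.5
    return total / n_samples
```

In a full pipeline, the walks would be fed to a skip-gram-style model to obtain one vector per node, and `score` would be something like `lambda u, v: cosine(vec[u], vec[v])`, where `vec` is a hypothetical mapping from node to learned vector.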
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
Peking University Core Journals
2016, Issue 10, pp. 1947-1964 (18 pages)
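Funding
National Natural Science Foundation of China (71271211, 71531012)
Beijing Natural Science Foundation (4132067)
Research Funds of Renmin University of China (10XNI029)
Renmin University of China 2015 Funding Program for the Cultivation of Outstanding Innovative Talents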
Keywords
link prediction
large-scale networks
node feature vector
distributed representation
neural network
machine learning