Abstract
As social networks proliferate, a growing number of researchers analyze social network data from a multi-source perspective, and fusing data from multiple social networks depends on identifying users across networks. To address the low utilization of heterogeneous relations within social networks by the existing Friend Relationship-based User Identification (FRUI) algorithm, a Weighted Hypergraph based User Identification (WHUI) algorithm across social networks is proposed. First, a weighted hypergraph is constructed on the friend relationship network to describe accurately both the friend relations and the heterogeneous relations within a single network, which improves the accuracy with which a node's topological environment is represented. Then, on the basis of the constructed weighted hypergraph, a cross-network similarity between nodes is defined, exploiting the property that a node's topological environment is roughly the same in different networks. Finally, an iterative matching algorithm selects the user pair with the highest cross-network similarity in each round, and two-way authentication and result pruning are added to safeguard identification accuracy. Experiments were carried out on the DBLP collaboration network and on real social networks. The results show that, on the real social networks, the proposed algorithm improves average precision by 5.5 percentage points, average recall by 3.4 percentage points, and average F-measure by 4.6 percentage points compared with the FRUI algorithm. Using only network topology information, the WHUI algorithm effectively improves the precision and recall of user identification in practical applications.
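The iterative matching step with two-way authentication can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no formulas, so the similarity used here (the summed minimum tie weight over already-matched neighbors) is an assumption, as are the function names `best_partner` and `iterative_match`, the weighted adjacency maps standing in for the weighted hypergraph, and the seed pairs. Result pruning is omitted.

```python
def best_partner(node, candidates, score):
    """Return (best candidate, score), or (None, 0.0) if nothing scores above zero."""
    best, best_score = None, 0.0
    for c in sorted(candidates):  # sorted for deterministic tie-breaking
        s = score(node, c)
        if s > best_score:
            best, best_score = c, s
    return best, best_score


def iterative_match(adj_a, adj_b, seeds):
    """Greedy cross-network matching with a two-way check (illustrative only).

    adj_a / adj_b map each node to {neighbor: weight}; the weights are assumed
    to summarize friend and heterogeneous ties. seeds maps known A-users to B-users.
    """
    matched = dict(seeds)  # network A user -> network B user

    def sim(u, v):
        # Assumed similarity: for each neighbor of u that is already matched,
        # add the smaller of the two tie weights if its counterpart neighbors v.
        return sum(min(w, adj_b[v].get(matched.get(n), 0.0))
                   for n, w in adj_a[u].items())

    while True:
        free_a = set(adj_a) - set(matched)
        free_b = set(adj_b) - set(matched.values())
        candidates = []
        for u in free_a:
            v, s = best_partner(u, free_b, sim)
            if v is None:
                continue
            # Two-way check: v must also prefer u among unmatched A-users.
            back, _ = best_partner(v, free_a, lambda b, a: sim(a, b))
            if back == u:
                candidates.append((s, u, v))
        if not candidates:
            return matched
        _, u, v = max(candidates)  # accept only the highest-scoring pair per round
        matched[u] = v


# Toy networks with identical topology; a1 and a2 are known seed users.
adj_a = {"a1": {"a2": 1.0, "a3": 1.0}, "a2": {"a1": 1.0, "a3": 1.0},
         "a3": {"a1": 1.0, "a2": 1.0, "a4": 1.0}, "a4": {"a3": 1.0}}
adj_b = {"b1": {"b2": 1.0, "b3": 1.0}, "b2": {"b1": 1.0, "b3": 1.0},
         "b3": {"b1": 1.0, "b2": 1.0, "b4": 1.0}, "b4": {"b3": 1.0}}
result = iterative_match(adj_a, adj_b, {"a1": "b1", "a2": "b2"})
```

In this toy case the first round matches a3 with b3, since both share two matched neighbors, and only then does a4 gain a matched neighbor and pair with b4. Accepting one pair per round lets each new match contribute evidence to later rounds.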
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2017, No. 12, pp. 3435-3441, 3471 (8 pages)
Funding
National Natural Science Foundation of China (61521003)
Keywords
user identification across social networks
weighted hypergraph
heterogeneous relation
node similarity
iterative matching