
Social Networking Link Prediction Based on Topic Model under Big Data
Cited by: 2
Abstract: Advances in computer and network technology have produced an explosion of data, social media has become part of everyday life, and social network analysis has become a research hotspot. The arrival of the big data era has had a major impact on research into social network link prediction algorithms, and existing prediction methods based on network structure are increasingly ill-suited to current conditions. We therefore propose a social network link prediction method based on a topic model. First, using a microblog social network as the data source, the experimental network is divided into a training set and a test set. Second, a topic model is used to obtain each user's topic features; these are combined with the named entity set and the user connection feature set to give a similarity measure over users' interest features. This interest similarity is then combined with network structure similarity to give the similarity between user nodes, which is used to predict links in the social network. Finally, the method is evaluated with AUC, the most commonly used evaluation metric for link prediction. Experiments show that the proposed method achieves higher prediction accuracy.
Authors: LUO Mei-liu (骆梅柳); PEI Ke-feng (裴可锋) (Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Department of Information, Jiangsu College of Finance & Accounting, Lianyungang 222061, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 4, pp. 36-40 (5 pages)
Funding: 2018 Jiangsu Province University Philosophy and Social Science Research Fund Project (2018SJA2019).
Keywords: big data; networking link; topic model; named entity; connection characteristics
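
The abstract outlines a pipeline: interest similarity (topic features, named entities, connection features) is combined with structural similarity into a node similarity score, which is then evaluated with AUC. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not give the actual formulas, so everything here is an assumption for illustration: cosine similarity over topic vectors, Jaccard similarity over sets, the equal-weight average, the `alpha` mixing weight, and the toy users `a`, `b`, `c`. The AUC routine uses the sampling definition common in link prediction: the fraction of comparisons in which a held-out test edge scores higher than a nonexistent edge, with ties counted as one half.

```python
import math
import random

def cosine(p, q):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def node_similarity(u, v, topics, entities, contacts, neighbors, alpha=0.5):
    """Hypothetical combination: alpha * interest + (1 - alpha) * structure."""
    # Interest similarity: assumed equal-weight average of three components.
    interest = (
        cosine(topics[u], topics[v])
        + jaccard(entities[u], entities[v])
        + jaccard(contacts[u], contacts[v])
    ) / 3
    # Structural similarity: assumed Jaccard overlap of training-set neighbors.
    structure = jaccard(neighbors[u], neighbors[v])
    return alpha * interest + (1 - alpha) * structure

def auc(score, test_edges, non_edges, n=1000, seed=0):
    """Sampling-based AUC: compare n random (test edge, non-edge) pairs."""
    rng = random.Random(seed)
    higher = equal = 0
    for _ in range(n):
        s_pos = score(*rng.choice(test_edges))
        s_neg = score(*rng.choice(non_edges))
        if s_pos > s_neg:
            higher += 1
        elif s_pos == s_neg:
            equal += 1
    return (higher + 0.5 * equal) / n

# Toy data (invented for illustration only).
topics = {"a": [0.8, 0.2], "b": [0.7, 0.3], "c": [0.1, 0.9]}
entities = {"a": {"NBA", "Lakers"}, "b": {"NBA"}, "c": {"opera"}}
contacts = {"a": {"x"}, "b": {"x", "y"}, "c": {"z"}}
neighbors = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}}

def score(u, v):
    return node_similarity(u, v, topics, entities, contacts, neighbors)

test_edges = [("a", "b")]               # held-out links that really exist
non_edges = [("a", "c"), ("b", "c")]    # pairs with no link

toy_auc = auc(score, test_edges, non_edges)
print(f"sim(a,b)={score('a', 'b'):.3f}  sim(a,c)={score('a', 'c'):.3f}  AUC={toy_auc:.2f}")
```

In this toy setup users `a` and `b` share topics, an entity, a contact, and neighbors, so their pair scores well above the pairs involving `c`. An AUC near 0.5 would indicate scores no better than chance, while values approaching 1 indicate that held-out links are consistently ranked above non-links.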