
An algorithm for identifying the same user across social networks (cited by: 2)

User alignment across social networks
Abstract: To identify the same user across social networks, this paper proposes a method that combines user interests, writing style, and profile attributes. User relationships are judged separately in each of these three feature dimensions, and the individual judgments are then combined to improve identification accuracy. User interests are divided into static and dynamic interests: static interests are extracted from users' background information with the TextRank algorithm, while dynamic interests, which change over time, are mined with a topic model from the text that users publish. Writing style is identified with the One-Class SVM algorithm, and the similarity of profile attributes is compared using information-entropy weighting. Experimental results show that the proposed algorithm improves both precision and recall compared with traditional machine learning algorithms.
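Authors: Shen Jiaqi (沈佳琪); Zhou Guomin (周国民) (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; Department of Computer and Information Security, Zhejiang Police College, Hangzhou 310053, China)
Source: Application of Electronic Technique (《电子技术应用》), 2022, No. 1, pp. 109-114 (6 pages)
Funding: NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (U1509219); Public Welfare Technology Research Program of the Zhejiang Provincial Natural Science Foundation (LGF19F02006); Ministry of Public Security Special Project on Basic Work for Strengthening Police through Science and Technology (2018GABJC33, 2019GABJC36).
Keywords: cross-social-network; user identification; user interest; writing style; profile attributes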
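
The abstract states that profile-attribute similarity is compared with information-entropy weighting but gives no formula. As a rough illustration only, the sketch below uses the textbook entropy weight method: attributes whose similarity scores vary more across candidate account pairs receive larger weights. The function names `entropy_weights` and `weighted_similarity`, the input layout (one row per candidate pair, one column per attribute, scores in [0, 1]), and the uniform fallback are assumptions for this example, not details taken from the paper.

```python
import math


def entropy_weights(scores):
    """Return one weight per attribute using the standard entropy weight method.

    scores: list of rows, one per candidate account pair; each row holds
    non-negative per-attribute similarity scores (hypothetical layout).
    """
    m = len(scores)
    if m < 2:
        raise ValueError("need at least two candidate pairs")
    n = len(scores[0])

    divergences = []
    for j in range(n):
        column = [row[j] for row in scores]
        total = sum(column)
        if total == 0:
            # An all-zero column carries no discriminating information.
            divergences.append(0.0)
            continue
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(m)  # normalise entropy to [0, 1]
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))

    total_divergence = sum(divergences)
    if total_divergence <= 1e-12:
        # Assumption: fall back to equal weights when no attribute discriminates.
        return [1.0 / n] * n
    return [d / total_divergence for d in divergences]


def weighted_similarity(row, weights):
    """Combine one pair's per-attribute similarities into a single score."""
    return sum(w * x for w, x in zip(weights, row))


if __name__ == "__main__":
    # Made-up scores: the second attribute is identical for every pair,
    # so it should receive (almost) no weight.
    demo = [[0.9, 0.5], [0.1, 0.5], [0.2, 0.5]]
    w = entropy_weights(demo)
    print(w, weighted_similarity(demo[0], w))
```

With the made-up scores in the demo, the second attribute is constant across all pairs, so nearly all of the weight goes to the first attribute. The weighting actually used in the paper may differ in its normalisation or in how missing attributes are handled.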