Abstract
Data in social media such as forums and blogs can fully reflect the thoughts and behavioral motivations of social network users. To analyze these data and uncover those motivations, this paper proposes a content-similarity-based method for analyzing the behavioral tendencies of social network users. Social media data are processed into a document set and vectorized to build a vector space model, and the similarity between documents is then used to determine which category a user belongs to. The KNN algorithm classifies the documents a user has published over a period of time, and the category that receives the most documents is taken as the user's actual tendency category. Finally, experiments demonstrate that the method is feasible.
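The pipeline described in the abstract (vectorize documents, compare them by similarity, classify each of a user's documents with KNN, then take the most frequent category) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the TF-IDF weighting, the cosine similarity measure, the value k=3, and all function names are assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for tokenized documents.

    TF-IDF is an assumed weighting; the abstract only says the
    documents are vectorized into a vector space model.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def vectorize(doc, idf):
    """Vectorize a new document, ignoring terms unseen in training."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_label(vec, train_vecs, train_labels, k=3):
    """Label one document by majority vote among its k most similar neighbours."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def user_tendency(user_docs, train_docs, train_labels, k=3):
    """Classify every document of a user and return the most frequent category."""
    train_vecs, idf = tf_idf_vectors(train_docs)
    labels = [
        knn_label(vectorize(doc, idf), train_vecs, train_labels, k)
        for doc in user_docs
    ]
    return Counter(labels).most_common(1)[0][0]
```

Here `train_docs` stands for a hypothetical labeled document set (lists of tokens) and `user_docs` for the documents one user published during the period of interest. Real Chinese-language posts would first need word segmentation, which this sketch leaves out.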
Source
《吉林师范大学学报(自然科学版)》
2016, No. 4, pp. 135-139 (5 pages)
Journal of Jilin Normal University: Natural Science Edition
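Funding
Natural Science Foundation of Hainan Province (20156241)
Scientific Research Project of Higher Education Institutions of Hainan Province (Hnky2015-72)
Research Project of Qiongtai Teachers College (qtky201404)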
Keywords
social media
content similarity
behavioral tendentiousness
text classification