Abstract
Using the popular Sina Weibo microblogging platform as the data source, user data is collected through the API provided by Sina Weibo together with web crawling. A support vector machine (SVM) algorithm first divides Weibo users into two classes: water army (paid poster) users and non-water-army users. An improved SVM algorithm then extracts feature values from the large volume of user data to build a multi-class SVM model that assigns users to four categories: normal users, hype-oriented water army, marketing-oriented water army, and rumor-oriented water army. The results show that the model identifies user types fairly accurately, with a low recognition error rate.
Authors
李新焕
黄伟力
LI Xinhuan; HUANG Weili (Jiangxi Engineering Vocational College of Jiangxi Open University, Nanchang 330046, China)
Source
《现代信息科技》
2022, No. 16, pp. 107-109 (3 pages)
Modern Information Technology
Funding
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ205702).