Abstract
With the spread and convenience of the Internet, review sites and e-commerce websites in China and abroad are growing rapidly, and the comments posted on them increasingly influence people's lives. Douban is a well-known online community where a growing number of users post comments on movies, books, and music, and where more and more people check ratings and comments before deciding whether to watch a movie, read a book, or listen to music. Identifying spam comments is therefore crucial, because spam comments distort people's true perception of the item being reviewed. This paper introduces semantic analysis, a book feature dictionary, and a spam comment dictionary. Semantic analysis helps detect additional features of spam comments, and a weight-proportion filtering model is used to detect the spam comments. Experimental results show that the proposed method reaches an accuracy of 85.4% and identifies spam comments effectively and accurately.
Authors
刘高军
印佳明
LIU Gao-jun; YIN Jia-ming (School of Computer, North China University of Technology, Beijing 100144, China)
Source
Computer Technology and Development (《计算机技术与发展》)
2019, No. 11, pp. 107-112 (6 pages)
Funding
National Natural Science Foundation of China (61672040)