Abstract
To extract commercial research value from user reviews and to support the subsequent optimization and service improvement of Internet products, an improved algorithm that combines naive Bayes with a decision tree is proposed. It handles noise in the text and avoids the zero-probability and missing-attribute-value problems, thereby improving classification accuracy. The algorithm first preprocesses the user review data, then uses a probability-optimized naive Bayes model to handle missing attribute values, and finally uses a decision tree to classify the data as positive or negative. Experiments on a WeChat official account user review dataset show that the improved algorithm reaches an accuracy of 80.27%, which is 0.5% higher than the traditional method.
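The abstract does not say how the naive Bayes probabilities are optimized. As a rough illustration only, the sketch below assumes Laplace (add-one) smoothing, a common way to avoid zero probabilities, and uses it to fill a missing categorical attribute from a record's other attributes. The function names `smoothed` and `nb_impute`, the attribute names, and the toy records are all hypothetical and are not taken from the paper. The final decision-tree classification step is not shown.

```python
import math
from collections import Counter


def smoothed(count, total, n_values, alpha=1.0):
    """Laplace (add-alpha) estimate; never returns zero when alpha > 0."""
    return (count + alpha) / (total + alpha * n_values)


def nb_impute(rows, target, alpha=1.0):
    """Return copies of rows with None in `target` filled by naive Bayes.

    Assumption: only the `target` column contains missing values (None).
    """
    complete = [r for r in rows if r[target] is not None]
    features = [k for k in rows[0] if k != target]
    # Observed values per attribute, used in the smoothing denominator.
    domain = {k: {r[k] for r in rows if r[k] is not None} for k in rows[0]}
    prior = Counter(r[target] for r in complete)
    cond = Counter((k, r[k], r[target]) for r in complete for k in features)

    def log_score(row, v):
        # log P(v) + sum over features of log P(feature value | v)
        s = math.log(smoothed(prior[v], len(complete), len(domain[target]), alpha))
        for k in features:
            s += math.log(smoothed(cond[(k, row[k], v)], prior[v], len(domain[k]), alpha))
        return s

    filled = []
    for r in rows:
        if r[target] is None:
            best = max(sorted(domain[target]), key=lambda v: log_score(r, v))
            r = {**r, target: best}  # copy, so the input row is left unchanged
        filled.append(r)
    return filled


# Hypothetical toy records: the last one has no value for "emoji".
rows = [
    {"praise": "yes", "emoji": "yes", "label": "pos"},
    {"praise": "yes", "emoji": "yes", "label": "pos"},
    {"praise": "no", "emoji": "no", "label": "neg"},
    {"praise": "no", "emoji": "no", "label": "neg"},
    {"praise": "yes", "emoji": None, "label": "pos"},
]
filled = nb_impute(rows, "emoji")
```

In this toy data the pair (praise = yes, emoji = no) never occurs, so an unsmoothed estimate would be zero and its logarithm undefined. With smoothing, every candidate value keeps a small nonzero probability. The filled records could then be passed to any decision tree learner for positive/negative classification.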
Authors
JIA Xiao-fan (贾晓帆)
HE Li-li (何利力)
School of Informatics and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018, China
Source
Software Guide (《软件导刊》)
2021, No. 7, pp. 1-5 (5 pages)
Funding
National Key Research and Development Program of China (2018YFB1700702).
Keywords
user review classification
decision tree algorithm
naive Bayes