Abstract
This design is based on machine learning. A web crawler collects publicly available review data from major online platforms and labels each review according to its rating. Using the naive Bayes classification algorithm and word segmentation of the text, the design fits a model of the relationship between a text and its sentiment, achieving better results than a traditional sentiment dictionary. Building on this algorithm, a big data review collection and analysis system is designed and developed. By analyzing relevant reviews on the Internet and presenting the results visually to enterprises, the system helps them better understand a product's market situation and identify its strengths and weaknesses, and thereby optimize decisions, formulate suitable strategies, and achieve better market performance.
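The pipeline described in the abstract (label reviews by rating, segment the text, fit a naive Bayes model) can be illustrated with a minimal sketch. This is not the authors' implementation. The rating thresholds in `label_from_score`, the whitespace-based `tokenize` placeholder (a real system for Chinese text would need a proper word segmenter), the Laplace smoothing parameter `alpha`, and the sample reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def label_from_score(score, pos_min=4, neg_max=2):
    """Map a star rating to a sentiment label (thresholds are assumed)."""
    if score >= pos_min:
        return "pos"
    if score <= neg_max:
        return "neg"
    return None  # mid-range ratings are treated as unlabeled and skipped


def tokenize(text):
    # Placeholder segmentation: assumes tokens are separated by spaces.
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of reviews
        self.vocab = set()

    def fit(self, reviews):
        for text, score in reviews:
            label = label_from_score(score)
            if label is None:
                continue
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus summed log likelihoods of the known tokens.
            logp = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][tok]
                logp += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical (text, rating) pairs standing in for crawled reviews.
reviews = [
    ("battery life is great", 5),
    ("great screen and fast delivery", 4),
    ("average product", 3),
    ("poor quality broke quickly", 1),
    ("terrible battery and poor support", 2),
]
model = NaiveBayes().fit(reviews)
print(model.predict("great screen"), model.predict("poor support"))
```

Deriving labels from ratings avoids manual annotation, which is the point the abstract makes about labeling during collection. The trade-off in this sketch is that mid-range ratings are simply discarded.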
Authors
HAN Shuai-kang (韩帅康); JIANG Tao (江涛); ZHANG Shun (张顺)
School of Information Engineering, Zhengzhou University of Science and Technology, Zhengzhou 450054, China
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2020, Issue 4, pp. 35-37 (3 pages)
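Funding
2019 College Students' Innovation and Entrepreneurship Training Program project "Big Data Review Collection and Analysis System" (Project No. DCY201915).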
Keywords
text collection
naive Bayes
machine learning
semantic analysis