Abstract
To identify product defects from online user reviews, this paper classifies product reviews with Co-forest, a disagreement-based semi-supervised classification algorithm. The reviews that Co-forest identifies as defect-related are then clustered with the BTM topic model, yielding the defect topics, a detailed description of each topic, and the proportion of each topic. A best-selling dehumidifier from one brand serves as a case study, using reviews from the JD.com (Jingdong) website. The results show that, for defect identification from online reviews, Co-forest outperforms both the supervised classification methods used in earlier studies and the semi-supervised Tri-training method.
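The abstract gives no implementation details for the clustering step. As a rough illustration of the kind of input BTM (the biterm topic model) works on, the sketch below extracts biterms, that is, unordered pairs of words co-occurring in the same short text. The function names, the sample reviews, and the assumption that reviews are already tokenized are all hypothetical; Chinese reviews would first need a word segmenter, and this is not the authors' code.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short document."""
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(documents):
    """Count biterms over a corpus of already-tokenized short documents."""
    counts = Counter()
    for tokens in documents:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical, pre-tokenized defect reviews (illustrative only,
# not data from the paper).
reviews = [
    ["noise", "loud", "night"],
    ["tank", "leak", "water"],
    ["noise", "loud", "fan"],
]
biterm_counts = count_biterms(reviews)
```

A topic model of this kind is fitted to the corpus-wide biterm counts rather than to per-document word counts, which is one reason it is commonly applied to short texts such as product reviews.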
Source
Journal of Heilongjiang University of Science and Technology (《黑龙江科技大学学报》)
CAS
2017, No. 6, pp. 698-704 (7 pages)
Keywords
defect identification
online reviews
semi-supervised classification
topic clustering
dehumidifier