Abstract
Existing studies on comment quality evaluation overlook differences among individual evaluators and introduce error through their choice of reference standard. To address both problems, this paper takes the blogger as the evaluator, treats comments that the blogger replied to as the reference for high-quality comments, and proposes a maximum-entropy-based model for evaluating the quality of microblog comments. Features are extracted with a web crawler and word vectors and then filtered by feature selection; a classification model is trained on the selected features by supervised learning, and its effectiveness is verified on test data. Experiments show that the proposed model applies broadly across different bloggers, with mean precision, recall, and F-value for comment classification reaching 66.64%, 86.33%, and 75.2%, respectively.
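The reported metrics can be illustrated with a minimal sketch. It assumes that the first figure is precision (the original term is sometimes rendered as "accuracy"), that the F value is the balanced F1 score, and that label 1 marks a comment the blogger replied to. The function names and toy labels are hypothetical and are not taken from the paper.

```python
def f_value(precision, recall):
    """Balanced F score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(predicted, actual):
    """Return (precision, recall, F) for binary labels.

    Label 1 = high-quality comment (assumed: replied to by the blogger),
    label 0 = any other comment.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_value(precision, recall)


# Toy labels for illustration only (not data from the paper).
toy_predicted = [1, 1, 1, 0, 0, 1]
toy_actual = [1, 0, 1, 1, 0, 0]
toy_metrics = precision_recall_f(toy_predicted, toy_actual)

# Consistency check on the reported averages.
reported_f = f_value(0.6664, 0.8633)
```

Under these assumptions, `f_value(0.6664, 0.8633)` evaluates to roughly 0.752, which agrees with the reported 75.2%. An average of per-blogger F values need not equal the F value of the averaged precision and recall, so this is only a plausibility check.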
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal (北大核心)
2018, No. 1, pp. 58-63 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (81360230, 81560296)
Keywords
maximum entropy
review quality
word vector
feature selection
supervised learning
classification