Method of Text Sentiment Classification Based on Pseudo Relevance Feedback
Abstract: In machine learning, training sets are rarely complete, so it is necessary to build incremental models capable of active learning. For incremental models based on pseudo relevance feedback, existing incremental learning methods offer several strategies for selecting feedback samples, but how to raise the class confidence of those samples still merits deeper study. To address this problem, the paper proposes a pseudo relevance feedback strategy based on K-Means clustering. For documents already classified by a Naive Bayesian (NB) classifier, the centroid vector is found by gradually reducing the number of samples; the feedback documents and a new feature set are then extracted and fed back to the NB classifier. The strategy is applied to Chinese text sentiment classification. Experiments show that the accuracy of the extracted centroid vectors improves as the feedback scale grows. To some extent the method converts posterior probability into prior probability, and as new features are added, combined with adjustment of the CHI threshold, it achieves higher precision and recall than the baseline, which demonstrates its feasibility.
Source: Computer Simulation (《计算机仿真》), indexed in CSCD and the Peking University Core Journals list, 2013, Issue 11, pp. 268-271 (4 pages)
Keywords: Pseudo relevance feedback; Sentiment classification; Naive Bayesian; K-Means clustering
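The abstract does not spell out how the decremental centroid search works, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes tokenized documents, a multinomial Naive Bayes classifier with add-one smoothing, cosine similarity, and a single centroid per predicted class in place of a full K-Means run; CHI feature selection is omitted. All function names and the `keep_ratio` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Multinomial Naive Bayes counts; docs are lists of tokens."""
    vocab = {t for d in docs for t in d}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        term_counts[y].update(d)
    return vocab, class_docs, term_counts

def predict_nb(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    vocab, class_docs, term_counts = model
    n = sum(class_docs.values())
    best, best_score = None, -math.inf
    for y in sorted(class_docs):
        total = sum(term_counts[y].values())
        score = math.log(class_docs[y] / n)
        for t in doc:
            if t in vocab:  # terms unseen in training are ignored
                score += math.log((term_counts[y][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = y, score
    return best

def centroid(vectors):
    """Mean term-frequency vector of a list of Counters."""
    c = Counter()
    for v in vectors:
        for t, w in v.items():
            c[t] += w / len(vectors)
    return c

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_feedback(docs, keep):
    """Assumed decremental search: repeatedly drop the document least
    similar to the current centroid, recomputing the centroid each round,
    until only `keep` documents remain."""
    kept = list(docs)
    while len(kept) > keep:
        c = centroid([Counter(d) for d in kept])
        worst = min(range(len(kept)), key=lambda i: cosine(Counter(kept[i]), c))
        kept.pop(worst)
    return kept

def feedback_round(train_docs, train_labels, unlabeled, keep_ratio=0.5):
    """Classify unlabeled docs, keep those nearest each class centroid,
    and retrain with their predicted labels (pseudo relevance feedback)."""
    model = train_nb(train_docs, train_labels)
    by_class = defaultdict(list)
    for d in unlabeled:
        by_class[predict_nb(model, d)].append(d)
    new_docs, new_labels = list(train_docs), list(train_labels)
    for y, group in by_class.items():
        keep = max(1, int(len(group) * keep_ratio))
        for d in select_feedback(group, keep):
            new_docs.append(d)
            new_labels.append(y)
    return train_nb(new_docs, new_labels)

# Toy data (invented for illustration only).
train_docs = [["good", "great"], ["bad", "awful"]]
train_labels = ["pos", "neg"]
unlabeled = [["good", "nice"], ["great", "nice"], ["bad", "poor"], ["awful", "poor"]]
updated = feedback_round(train_docs, train_labels, unlabeled, keep_ratio=1.0)
```

In this toy run the retrained model's vocabulary gains the terms "nice" and "poor", which appear only in the pseudo-labeled documents. That is the sense in which feedback lets posterior decisions feed into the next round's priors and likelihoods.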