Sentiment Mining Model for E-commerce Reviews (cited by: 3)

Sentiment analysis method on electric business reviews
Abstract: Mining product reviews helps merchants and manufacturers understand consumer needs and improve product design promptly. Most current approaches to review mining extract sentiment features and classify them with a classifier. E-commerce reviews, however, are varied in expression, informally written, and colloquial, and the resulting data sparsity, high feature dimensionality, imbalanced samples, and domain dependence of sentiment lexicons make sentiment feature extraction increasingly difficult. To address these problems, the paper proposes a complete method for mining e-commerce reviews. The method combines several strategies to build a sentiment lexicon for the e-commerce domain, uses text length as a feature, optimizes the stop list against the corpus, and combines document frequency with TF-IDF for feature selection and feature weighting. The method is validated on a corpus of water heater reviews with a support vector machine (SVM) as the core classifier. The experimental results show that it substantially improves sentiment classification accuracy while reducing the text feature dimension.
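Authors: XIONG Le (熊乐), RAO Hong (饶泓) (Department of Information Engineering, Nanchang University, Nanchang 330031, China)
Source: Journal of Nanchang University (Natural Science) (《南昌大学学报(理科版)》), indexed in CAS and the Peking University Core Journals list, 2018, No. 1, pp. 88-94 (7 pages)
Funding: National Natural Science Foundation of China (61262047); Jiangxi Provincial Key Research and Development Program (20171BBE50063); Science and Technology Fund of the Jiangxi Provincial Department of Education (GJJ14141)
Keywords: Sentiment analysis; Stop list; Sentiment lexicon; Document frequency; TF-IDF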
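
The abstract mentions combining document frequency with TF-IDF for feature selection and weighting, and adding text length as a feature. The record gives no formulas or thresholds, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. The function names, the `min_df` and `max_df_ratio` thresholds, the smoothed IDF formula, and the toy tokenized reviews are all assumptions made for illustration.

```python
import math
from collections import Counter


def select_by_document_frequency(docs, min_df=2, max_df_ratio=0.9):
    """Feature selection: keep terms that are neither too rare nor too common.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(
        term for term, count in df.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    )
    return vocab, df


def tfidf_with_length(docs, vocab, df):
    """Feature weighting: TF-IDF per selected term, plus text length."""
    n_docs = len(docs)
    # One common IDF variant; the paper's exact formula is not given here.
    idf = {term: math.log(n_docs / df[term]) + 1.0 for term in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        row = [counts[term] / total * idf[term] for term in vocab]
        row.append(float(len(doc)))  # text length as an extra feature
        vectors.append(row)
    return vectors


# Hypothetical, already-tokenized water heater reviews.
docs = [
    ["heater", "works", "great", "fast", "delivery"],
    ["heater", "broke", "bad", "service"],
    ["great", "heater", "great", "price"],
    ["bad", "install", "service", "heater"],
]

vocab, df = select_by_document_frequency(docs)
vectors = tfidf_with_length(docs, vocab, df)

for row in vectors:
    print([round(value, 3) for value in row])
```

In this toy corpus, "heater" appears in every review and is dropped by the upper document-frequency bound, while terms seen in only one review are dropped by the lower bound. The resulting vectors could then be passed to an SVM classifier, the classifier the abstract names.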
