An algorithm of online product feature extraction based on boundary average entropy (cited by: 10)
Abstract: With the rapid growth of e-commerce, text mining of online user reviews has become an active research topic. When making a purchasing decision, a user wants to know not only the overall evaluation of a product but also the sentiment expressed toward each of its individual features. This paper therefore addresses the automatic extraction of product features from online reviews. N-grams that match the BNP (base noun phrase) pattern are selected as candidate features, and the candidates are then filtered using the boundary average entropy of each N-gram together with the substring dependency relationships among candidates to obtain the final product features. Compared with a baseline that takes BNP-pattern matches directly as product features, the filtering conditions used here effectively improve extraction accuracy. The method does not rely on an external domain corpus and requires no manual intervention. Its output is organized hierarchically according to substring dependency (a tree structure), which can serve as a useful reference data structure for building domain knowledge ontologies.
Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》), 2016, No. 9, pp. 2416-2423 (8 pages). Indexed in EI, CSSCI, CSCD, and the Peking University Core Journals list.
Keywords: online reviews; product feature; boundary average entropy
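
The abstract does not give the formula for boundary average entropy, so the following is only a minimal sketch of one common reading: take the Shannon entropy of the tokens seen immediately to the left of an N-gram and of those seen immediately to the right, then average the two. The function names, the `<s>`/`</s>` sentence padding, and the toy English sentences are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of neighbor tokens."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def boundary_average_entropy(sentences, ngram):
    """Average of left- and right-neighbor entropy for `ngram`.

    Assumption: this averaging is one plausible definition, not
    necessarily the one used in the paper.
    """
    ngram = tuple(ngram)
    n = len(ngram)
    left, right = Counter(), Counter()
    for tokens in sentences:
        # Pad so that occurrences at sentence edges still have neighbors.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - n):
            if tuple(padded[i:i + n]) == ngram:
                left[padded[i - 1]] += 1
                right[padded[i + n]] += 1
    return (entropy(left) + entropy(right)) / 2


# Toy, already-tokenized reviews (hypothetical data).
reviews = [
    ["battery", "life", "is", "good"],
    ["long", "battery", "life", "matters"],
    ["the", "battery", "life", "lasts"],
]

full = boundary_average_entropy(reviews, ("battery", "life"))
part = boundary_average_entropy(reviews, ("battery",))
print(full > part)
```

In this toy data "battery" is always followed by "life", so its right-boundary entropy is zero and its average is lower than that of "battery life". A filter of this kind would keep the longer phrase and treat the shorter one as a dependent substring, which is in the spirit of the substring dependency filtering the abstract describes.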
