
Feature extraction method based on Bootstrapping in English product comment (cited by: 1)
Abstract: A Bootstrapping-based method is proposed for extracting aspect feature words from English product reviews. Starting from a small set of seed extraction patterns, the method discovers new feature words through an incremental, iterative process. In each iteration it applies statistical techniques, together with sentiment-word analysis based on a sentiment lexicon, and uses the affinity between candidate feature words and patterns to compute a score for the probability that each candidate is extracted. Candidates are then ranked and filtered by this score, which helps prevent topic drift. After extraction, WordNet is used to compute the similarity between feature words, and the words are clustered by similarity score to obtain clusters of feature words for the different aspects of the product. Low-scoring clusters are filtered out to further remove noise. In addition, seed patterns are used in place of seed feature words to improve the portability of the system. Experimental results show that the method achieves a precision of 0.799, a recall of 0.779, and an F-measure of 0.789 for product aspect feature extraction, indicating good extraction performance.
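Authors: WANG Hui, CHEN Guang
Source: Journal of Shandong University (Natural Science), 2014, No. 12, pp. 23-29 (7 pages). Indexed in CAS, CSCD, and the Peking University core journal list.
Funding: Programme of Introducing Talents of Discipline to Universities (111 Project) (B08004); Science and Technology Major Project (2011ZX03002-005-01); National Natural Science Foundation of China (61273217); Doctoral Program Foundation (20130005110004)
Keywords: feature extraction; bootstrapping; information extraction; WordNet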
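The iterative procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the sketch assumes a simple stand-in in which a candidate's score is the number of distinct patterns that extract it (`min_support` is a hypothetical threshold). It also assumes that patterns are (left word, right word) contexts and that the sentiment lexicon is used only to keep sentences containing an opinion word. All function names and the toy corpus are illustrative.

```python
from collections import defaultdict


def slot_fillers(pattern, tokens):
    """Yield each word that sits between the pattern's left and right context words."""
    left, right = pattern
    for i in range(1, len(tokens) - 1):
        if tokens[i - 1] == left and tokens[i + 1] == right:
            yield tokens[i]


def contexts_of(word, tokens):
    """Yield the (left, right) context pattern around each occurrence of word."""
    for i in range(1, len(tokens) - 1):
        if tokens[i] == word:
            yield (tokens[i - 1], tokens[i + 1])


def bootstrap(sentences, seed_patterns, opinion_words, min_support=2, max_rounds=5):
    """Grow a feature set from seed patterns; scoring here is an assumed stand-in."""
    patterns = set(seed_patterns)
    features = set()
    # Assumption: only sentences that contain an opinion word are considered.
    opinionated = [s for s in sentences if opinion_words.intersection(s)]
    for _ in range(max_rounds):
        # Step 1: collect, for each new candidate, the patterns that extract it.
        support = defaultdict(set)
        for tokens in opinionated:
            for pattern in patterns:
                for word in slot_fillers(pattern, tokens):
                    if word not in opinion_words and word not in features:
                        support[word].add(pattern)
        # Step 2: keep candidates extracted by at least min_support distinct patterns.
        accepted = sorted(w for w, ps in support.items() if len(ps) >= min_support)
        if not accepted:
            break
        features.update(accepted)
        # Step 3: induce new patterns from the contexts of the accepted features.
        for tokens in opinionated:
            for word in accepted:
                patterns.update(contexts_of(word, tokens))
    return features


corpus = [
    "the screen is great", "the screen was bright",
    "the battery is good", "the battery was poor",
    "the weather is bad", "the manual is here",
    "this screen looks great", "this lens looks good", "the lens is good",
]
sentences = [line.split() for line in corpus]
seeds = {("the", "is"), ("the", "was")}
lexicon = {"great", "bright", "good", "poor", "bad"}

found = bootstrap(sentences, seeds, lexicon)
print(sorted(found))  # ['battery', 'lens', 'screen']
```

In this toy run, "lens" is rejected in the first round because only one seed pattern extracts it, and it is accepted in the second round once the pattern ("this", "looks") has been induced from "screen". "weather" is never supported by more than one pattern and is filtered out, which is the kind of drift the ranking step is meant to limit. The WordNet-based clustering stage is not shown.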