Method for New Sentiment Word Extraction Based on Boundary Features (基于边界特征的情感新词提取方法)
Abstract: The sentiment lexicon is a basic resource for sentiment analysis and an important basis for opinion mining and sentiment polarity identification. With the rapid emergence of new words online, extracting new sentiment words has become a pressing problem. To address it, this paper proposes a method for extracting new sentiment words based on boundary features. The method uses the skip-gram model, together with existing sentiment words, to mine the boundary features of sentiment words and build a boundary feature set. It then uses this set to extract a candidate set of new sentiment words, filters the candidates with bigram collocation, sequential patterns, and other methods, and scores each candidate string by its frequency and by the distribution in the corpus of the boundary features it co-occurs with. Experimental results on a microblog corpus show that the precision of new sentiment word identification is positively correlated with the candidate score; the precision is 83.33% when the candidate score is 11. The experiments show that the boundary-feature-based method can effectively identify new sentiment words in large-scale corpora.
Authors: ZHU Bo (朱波), HOU Min (侯敏)
Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition) (《重庆邮电大学学报(自然科学版)》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2014, No. 6, pp. 796-802 (7 pages)
Keywords: new sentiment word; boundary feature; skip-gram; sequential pattern
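The pipeline outlined in the abstract (mine boundary features from known sentiment words, extract candidates that appear inside those boundaries, then score them) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: a boundary feature is taken to be the pair of tokens immediately left and right of a seed sentiment word (a plain adjacent-context pair stands in for the paper's skip-gram mining), the bigram and sequential-pattern filters are omitted, and the score is simply the number of distinct boundary pairs a sufficiently frequent candidate occurs in. The function names, thresholds, and toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def mine_boundary_features(sentences, seed_words, min_count=2):
    """Collect (left, right) token pairs that surround known sentiment words.

    Assumption: a pair counts as a boundary feature once it has been seen
    around seed words at least `min_count` times.
    """
    counts = Counter()
    for tokens in sentences:
        for i in range(1, len(tokens) - 1):
            if tokens[i] in seed_words:
                counts[(tokens[i - 1], tokens[i + 1])] += 1
    return {pair for pair, n in counts.items() if n >= min_count}


def extract_candidates(sentences, boundary_features, known_words):
    """Find unknown tokens that occur inside a mined boundary pair."""
    contexts = defaultdict(set)  # candidate -> distinct boundary pairs seen
    freq = Counter()             # candidate -> occurrences inside boundaries
    for tokens in sentences:
        for i in range(1, len(tokens) - 1):
            pair = (tokens[i - 1], tokens[i + 1])
            if pair in boundary_features and tokens[i] not in known_words:
                contexts[tokens[i]].add(pair)
                freq[tokens[i]] += 1
    return contexts, freq


def score_candidates(contexts, freq, min_freq=2):
    """Hypothetical score: number of distinct boundary pairs per candidate,
    keeping only candidates seen at least `min_freq` times."""
    return {w: len(pairs) for w, pairs in contexts.items() if freq[w] >= min_freq}


# Toy, pre-tokenized corpus (hypothetical data).
sentences = [
    ["it", "is", "very", "good", "indeed"],
    ["it", "is", "very", "bad", "indeed"],
    ["so", "good", "!"],
    ["so", "bad", "!"],
    ["it", "is", "very", "lit", "indeed"],
    ["so", "lit", "!"],
    ["so", "far", "away"],
]
seed_words = {"good", "bad"}

features = mine_boundary_features(sentences, seed_words)
contexts, freq = extract_candidates(sentences, features, seed_words)
scores = score_candidates(contexts, freq)
print(scores)
```

In this toy run the unknown token "lit" is picked up because it appears inside both boundary pairs mined from the seed words, while "far" is ignored because its context matches no mined pair. A real system would also need the filtering steps named in the abstract and a scoring function matched to the paper's definition, which the abstract does not give.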
