Two-step text orientation identification based on feature extension (cited by: 4)
Abstract This paper presents a two-step text orientation analysis method based on feature extension. The method uses sentiment words, drawn from an orientation word list, a negation word list, and a degree word list, to extend the features of the training texts. It then builds two classifiers, CF1 and CF2, which differ in whether sentiment words and content words are treated equally. At classification time, each test text undergoes the same kind of feature extension as the training texts and is first classified with CF1. The reliable part of the CF1 results is accepted directly; the unreliable part is classified a second time with CF2, which gives the final decision. Experimental results confirm the effectiveness of the method.
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, No. 1, pp. 162-165, 169 (5 pages)
Funding National Natural Science Foundation of China (No. 60703010); Science and Technology Research Project of the Chongqing Municipal Education Commission (No. KJ070519)
Keywords Chinese information processing; feature extension; orientation identification; classifier construction
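
The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicons, the tag features `__POS__`/`__NEG__`, the two-token context window, the small Naive Bayes model, the choice that CF1 treats all features equally while CF2 uses only sentiment features, and the log-score `margin` used as the reliability test are all assumptions made for illustration. Texts are assumed to be already segmented into tokens.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicons; the paper's actual word lists are not given here.
ORIENTATION = {"good": "POS", "great": "POS", "bad": "NEG", "poor": "NEG"}
NEGATION = {"not", "never"}
DEGREE = {"very", "extremely"}

def extend(tokens):
    """Return (content_features, sentiment_features) for one tokenized text."""
    sentiment = []
    for i, tok in enumerate(tokens):
        if tok not in ORIENTATION:
            continue
        polarity = ORIENTATION[tok]
        window = tokens[max(0, i - 2):i]  # assumed 2-token context window
        if any(w in NEGATION for w in window):
            polarity = "NEG" if polarity == "POS" else "POS"  # negation flips
        repeat = 2 if any(w in DEGREE for w in window) else 1  # degree boosts
        sentiment.extend(["__" + polarity + "__"] * repeat)
    return list(tokens), sentiment

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing."""
    def fit(self, docs, labels):
        self.counts = defaultdict(Counter)
        self.priors = Counter(labels)
        for feats, y in zip(docs, labels):
            self.counts[y].update(feats)
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def log_scores(self, feats):
        total = sum(self.priors.values())
        scores = {}
        for y in self.priors:
            denom = sum(self.counts[y].values()) + len(self.vocab)
            s = math.log(self.priors[y] / total)
            for f in feats:
                if f in self.vocab:  # unseen features are ignored
                    s += math.log((self.counts[y][f] + 1) / denom)
            scores[y] = s
        return scores

def two_step_classify(tokens, cf1, cf2, margin=2.0):
    """Classify with CF1; fall back to CF2 when CF1's margin is too small."""
    content, sentiment = extend(tokens)
    s1 = cf1.log_scores(content + sentiment)
    best, runner_up = sorted(s1.values(), reverse=True)[:2]  # two classes
    if best - runner_up >= margin:  # assumed reliability criterion
        return max(s1, key=s1.get), "CF1"
    s2 = cf2.log_scores(sentiment)
    return max(s2, key=s2.get), "CF2"

# Toy training data (invented for this sketch).
train = [
    ("the screen is very good".split(), "pos"),
    ("great battery and good price".split(), "pos"),
    ("the battery is bad".split(), "neg"),
    ("not good and very poor support".split(), "neg"),
]
extended = [extend(tokens) for tokens, _ in train]
labels = [y for _, y in train]
cf1 = NaiveBayes().fit([c + s for c, s in extended], labels)  # equal treatment
cf2 = NaiveBayes().fit([s for _, s in extended], labels)      # sentiment only

print(two_step_classify("very good screen".split(), cf1, cf2))
print(two_step_classify("the price is not good".split(), cf1, cf2))
```

In this toy setup, the second text mixes content words seen in both classes, so CF1's margin falls below the threshold and the decision is handed to CF2. How reliability is actually measured in the paper is not stated in the abstract, so the margin test here is only a placeholder.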