
A Semi-Supervised Learning Method for Recognizing Subjective Sentences in Chinese Microblogs (cited by: 2)

Semi-supervised learning approach to recognize subjective sentences in microblogs
Abstract: Subjective sentences in microblogs carry people's attitudes, opinions, and leanings toward things. The length limit and loose linguistic structure of microblog posts make such sentences hard to identify. Drawing on two kinds of features used in traditional text processing, part-of-speech features and sentiment-lexicon features, this paper uses AdaBoost to select and combine classifiers. For datasets in which only a small share of the data is labeled, it further applies a Bootstrapping process that rebuilds the classifier iteratively: the current classifier labels the sentences in the unlabeled set that it is most confident about, those sentences are added to the labeled set, and the classifier is retrained. Experiments show that Bootstrapping raises the classifier's F1-score and reduces the number of features the classifier carries, yielding a significant improvement in both the precision and the speed of the ensemble classifier.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2014, No. 7, pp. 2035-2039 (5 pages)
Funding: Major and Key Projects of the Fujian Provincial Science and Technology Plan (2011H6016, 2011H0028)
Keywords: microblog; subjective sentence; part-of-speech; AdaBoost; Bootstrapping
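
The loop described in the abstract (train an AdaBoost classifier, label the most confident unlabeled sentences, add them to the labeled set, retrain) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the binary feature vectors, the decision-stump weak learners, the normalized-margin confidence score, and the `threshold`, `rounds`, and `max_iter` parameters are all assumptions made for the example.

```python
import math

def stump(x, j, pol):
    """Weak learner: predict pol if binary feature j is present, else -pol."""
    return pol if x[j] else -pol

def train_adaboost(X, y, rounds=10):
    """X: list of 0/1 feature vectors, y: labels in {-1, +1}.
    Returns a list of (alpha, feature_index, polarity) stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for j in range(len(X[0])):
            for pol in (1, -1):
                # Weighted error of this stump on the current distribution.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump(xi, j, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, pol)
        err, j, pol = best
        if err >= 0.5:          # no stump beats chance; stop boosting
            break
        perfect = err < 1e-10
        err = max(err, 1e-10)   # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, pol))
        if perfect:             # training data already separated
            break
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump(xi, j, pol))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def score(model, x):
    """Normalized vote in [-1, 1]; its magnitude is used as confidence."""
    total = sum(a for a, _, _ in model)
    if total == 0:
        return 0.0
    return sum(a * stump(x, j, p) for a, j, p in model) / total

def predict(model, x):
    return 1 if score(model, x) > 0 else -1

def bootstrap(X_lab, y_lab, X_unlab, threshold=0.8, max_iter=5, rounds=10):
    """Self-training: move confidently labeled sentences into the
    labeled set and retrain, until none qualify or max_iter is hit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = train_adaboost(X_lab, y_lab, rounds)
    for _ in range(max_iter):
        scored = [(x, score(model, x)) for x in pool]
        confident = [(x, s) for x, s in scored if abs(s) >= threshold]
        if not confident:
            break
        for x, s in confident:
            X_lab.append(x)
            y_lab.append(1 if s > 0 else -1)
        pool = [x for x, s in scored if abs(s) < threshold]
        model = train_adaboost(X_lab, y_lab, rounds)
    return model, pool
```

In this sketch each feature could stand for something like "contains a sentiment-lexicon word" or "contains an adjective", in the spirit of the two feature types the abstract names; the actual feature definitions and confidence criterion used in the paper are not given here. Calling `bootstrap(X_lab, y_lab, X_unlab)` returns the final stump ensemble together with the sentences that were never labeled confidently.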

