Dataset Design and Evaluation Index for Imbalanced Text Sentiment Classification
Abstract: As research on imbalanced classification deepens, how to split the data into training and test sets has become a question worth considering. Addressing dataset design for imbalanced text sentiment classification, this paper uses under-sampling to design two schemes for the test set, one balanced and one imbalanced, describes how to choose the appropriate experimental scheme under different task requirements, and discusses the evaluation metrics used to verify classifier performance. Experiments on two real online review datasets indicate that the proposed schemes are reasonable and applicable.
Source: Computer Development & Applications (《电脑开发与应用》), 2013, No. 5, pp. 1-4 (4 pages)
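Funding: National Natural Science Foundation of China (60970014, 61272095); Natural Science Foundation of Shanxi Province (2010011021-1); Shanxi Province Science and Technology Key Project (20110321027-02)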
Keywords: imbalanced data; sentiment classification; experimental design
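
The abstract contrasts balanced and imbalanced test sets built by under-sampling, but this record does not give the paper's procedure or name its metrics. The following is only a minimal sketch of the general idea, under stated assumptions: random under-sampling down to the size of the rarest class, and accuracy, per-class recall, and G-mean as stand-in metrics. The function names `undersample` and `evaluate` and the toy labels are hypothetical and not taken from the paper.

```python
import math
import random
from collections import Counter


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class is as large as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    kept = [(x, y) for y, xs in by_class.items() for x in rng.sample(xs, n_min)]
    rng.shuffle(kept)
    return [x for x, _ in kept], [y for _, y in kept]


def evaluate(y_true, y_pred):
    """Return accuracy, per-class recall, and G-mean (geometric mean of recalls)."""
    classes = sorted(set(y_true))
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = Counter(y_true)
    recall = {c: correct[c] / total[c] for c in classes}
    accuracy = sum(correct.values()) / len(y_true)
    g_mean = math.prod(recall.values()) ** (1 / len(classes))
    return accuracy, recall, g_mean


# Made-up imbalanced test set: 90 positive reviews, 10 negative reviews.
y_true = ["pos"] * 90 + ["neg"] * 10
# A trivial classifier that always predicts the majority class.
y_pred = ["pos"] * 100

# Imbalanced test scheme: evaluate on the test set as it is.
acc, rec, g = evaluate(y_true, y_pred)

# Balanced test scheme: under-sample the test indices, then evaluate.
bal_idx, bal_true = undersample(list(range(len(y_true))), y_true)
bal_pred = [y_pred[i] for i in bal_idx]
bal_acc, bal_rec, bal_g = evaluate(bal_true, bal_pred)

print(f"imbalanced: accuracy={acc:.2f} recall={rec} g_mean={g:.2f}")
print(f"balanced:   accuracy={bal_acc:.2f} recall={bal_rec} g_mean={bal_g:.2f}")
```

On this toy data the majority-class predictor scores 0.90 accuracy on the imbalanced test set but 0.50 on the balanced one, while G-mean is 0 in both cases. This illustrates why the choice of test scheme and metric should follow the task requirement.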