Research on Classification Methods for Imbalanced Data Sets (不平衡数据集的分类方法研究) Cited by: 23

Research of imbalanced data classification
Abstract Traditional classification algorithms tend to favor the majority class when applied to imbalanced data, which lowers classification accuracy on the minority class. This paper surveys the classification of imbalanced data. It first introduces existing performance evaluation measures for imbalanced data classification, then reviews the commonly used data-sampling methods and existing classification methods, and finally describes combined approaches that pair data sampling with classification methods.
Source Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2008, No. 5, pp. 1301-1303, 1308 (4 pages)
Keywords machine learning; imbalanced data set; data classification
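
The majority-class bias described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names (`accuracy`, `recall`, `random_oversample`), the 95:5 class split, and the choice of plain random oversampling as the sampling method are all assumptions made for illustration. It shows why overall accuracy can look good while the minority class is missed entirely, and how a simple resampling step rebalances the training labels.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of samples of class `positive` that were labelled correctly."""
    preds_for_positive = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positive:
        return 0.0
    return sum(p == positive for p in preds_for_positive) / len(preds_for_positive)


def random_oversample(labels, minority, seed=0):
    """Return sample indices in which minority indices are repeated at random
    until both classes have the same count (binary labels assumed)."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    if not minority_idx:
        return list(range(len(labels)))
    majority_count = len(labels) - len(minority_idx)
    extra = [rng.choice(minority_idx)
             for _ in range(majority_count - len(minority_idx))]
    return list(range(len(labels))) + extra


# Hypothetical data: 95 majority samples (label 0), 5 minority samples (label 1).
y_true = [0] * 95 + [1] * 5
# A classifier fully biased toward the majority class predicts 0 everywhere.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))            # 0.95: looks good overall
print(recall(y_true, y_pred, positive=1))  # 0.0: every minority sample is missed

balanced = random_oversample(y_true, minority=1)
print(sum(y_true[i] == 1 for i in balanced), len(balanced))  # 95 190
```

Random oversampling is only the simplest member of the sampling family; it duplicates existing minority samples rather than creating new ones.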
