An SVM Learning Algorithm for Imbalanced Data (一种用于非平衡数据的SVM学习算法). Cited by: 7

SVM Learning Algorithm Used in Imbalance Data
Abstract: In practical applications, classification data are often imbalanced: one class is rare relative to the other, and misclassifying the rare (minority) class may cost far more than misclassifying the majority class. Classification performance must therefore account for misclassification cost as well as accuracy. This paper extends the Support Vector Machine (SVM) learning method with a Gaussian kernel by using different penalty parameters, C+ for the minority class and C- for the majority class, to obtain a more sensitive hyperplane, and proposes using a genetic algorithm to optimize the SVM learning parameters. A new evaluation (quality measure) function is introduced to assess the quality of the classification results during optimization. Experimental results show that the algorithm performs well on imbalanced data and predicts minority-class samples with high accuracy.
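The idea in the abstract can be illustrated with a minimal sketch. This record gives only the abstract, so everything below is an assumption rather than the authors' implementation: the solver is a simplified bias-free dual coordinate ascent in which C+ and C- appear as per-class upper bounds on the dual variables, the G-mean (geometric mean of sensitivity and specificity) stands in for the paper's unspecified evaluation function, and the genetic operators, search ranges, and all function names are hypothetical.

```python
import math
import random

def rbf(a, b, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def train_svm(X, y, c_pos, c_neg, gamma, epochs=30):
    """Dual coordinate ascent for a bias-free kernel SVM (a simplification).

    alpha_i is boxed in [0, C+] when y_i = +1 and in [0, C-] when y_i = -1;
    this is where the class-dependent penalties enter the dual problem.
    """
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            upper = c_pos if y[i] == 1 else c_neg
            # Exact maximizer along coordinate i, clipped to the box.
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(max(alpha[i] + step, 0.0), upper)
    return alpha

def predict(X, y, alpha, gamma, x):
    """Sign of the kernel expansion at x (no bias term)."""
    score = sum(a * t * rbf(s, x, gamma) for a, t, s in zip(alpha, y, X))
    return 1 if score >= 0 else -1

def g_mean(y_true, y_pred):
    """Stand-in evaluation function: sqrt(sensitivity * specificity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0
    return math.sqrt((tp / pos) * (tn / neg))

def genetic_search(fitness, bounds, rng, pop_size=8, generations=5):
    """Tiny genetic algorithm: truncation selection, uniform crossover,
    Gaussian mutation. Genes are clipped to their bounds."""
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [min(max(g + rng.gauss(0, 0.3), lo), hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def make_data(rng, n_neg, n_pos):
    """Synthetic imbalanced data: many negatives near (0, 0), few positives near (2, 2)."""
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_neg)]
    X += [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(n_pos)]
    return X, [-1] * n_neg + [1] * n_pos

rng = random.Random(0)
X_tr, y_tr = make_data(rng, 40, 6)
X_va, y_va = make_data(rng, 40, 6)
BOUNDS = [(-1, 3), (-1, 3), (-2, 1)]  # assumed log10 ranges for C+, C-, gamma

def fitness(genes):
    """Validation G-mean of the SVM trained with the decoded parameters."""
    c_pos, c_neg, gamma = (10 ** g for g in genes)
    alpha = train_svm(X_tr, y_tr, c_pos, c_neg, gamma)
    preds = [predict(X_tr, y_tr, alpha, gamma, x) for x in X_va]
    return g_mean(y_va, preds)

best = genetic_search(fitness, BOUNDS, rng)
c_pos, c_neg, gamma = (10 ** g for g in best)
print(f"C+={c_pos:.3g} C-={c_neg:.3g} gamma={gamma:.3g} G-mean={fitness(best):.3f}")
```

Searching in log10 space lets a small population cover several orders of magnitude for each parameter. Scoring candidates with a class-balanced measure rather than plain accuracy keeps the search from favoring classifiers that ignore the minority class.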
Authors: Jiang Sha (蒋莎), Zhang Xiaolong (张晓龙)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list. 2008, No. 20, pp. 198-199, 202 (3 pages)
Keywords: Support Vector Machine (SVM); imbalanced data; evaluation (measure) function; learning parameter optimization
References (4)

  • 1. Vapnik V N. The Nature of Statistical Learning Theory[M]. New York, USA: Springer-Verlag, 1995.
  • 2. Zhang Qi, Wu Bin, Wang Bai. An Overview of Training Methods for Imbalanced Data[J]. Computer Science (计算机科学), 2005, 32(10): 181-186. (In Chinese.) Cited by: 10
  • 3. Musicant D, Kumar V, Ozgur A. Optimizing F-measure with Support Vector Machines[C]//Proceedings of the 16th International Florida Artificial Intelligence Research Society Conference. Florida, USA: AAAI Press, 2003: 356-360.
  • 4. Morik K, Brockhausen P, Joachims T. Combining Statistical Learning with a Knowledge-based Approach: A Case Study in Intensive Care Monitoring[C]//Proceedings of the International Conference on Machine Learning. San Diego, CA, USA: [s. n.], 1999.
