Balance Method for Imbalanced Support Vector Machines (cited by: 15)
Abstract: An adjustment algorithm is proposed for the separating hyperplane of a support vector machine (SVM) trained on imbalanced two-class data. First, a standard SVM is trained on the original data to obtain a normal vector of the separating hyperplane. Second, the high-dimensional samples are projected onto this normal vector, yielding one-dimensional data. Third, the ratio of the penalty factors for the two classes is determined from the standard deviations of the projected data and the sample sizes of the two classes. Finally, a standard SVM is trained a second time with these penalty factors, giving a new separating hyperplane. Experiments show that the method is effective: in general it balances the error rates of the two classes, and it can even reduce them.
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 2, pp. 136-141 (6 pages)
Funding: National Natural Science Foundation of China (No. 60574075)
Keywords: imbalanced data, feature extraction, support vector machines (SVM), projection, standard deviation
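
The projection and penalty-ratio step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes the normal vector `w` from the first training run is already available from some linear SVM solver. The abstract does not state the formula that turns the standard deviations and sample sizes into a penalty ratio, so the rule in `penalty_ratio` is a hypothetical placeholder, and the function names are illustrative.

```python
import math
import statistics


def project(samples, w):
    """Project each sample onto the unit vector along w (the hyperplane normal)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [sum(xi * wi for xi, wi in zip(x, w)) / norm for x in samples]


def penalty_ratio(pos, neg, w):
    """Return a hypothetical ratio C_pos / C_neg for the second training run.

    ASSUMPTION: the abstract only says the ratio is derived from the
    standard deviations of the projected data and the class sizes.
    The rule below is a placeholder, not the formula from the paper.
    """
    s_pos = statistics.stdev(project(pos, w))  # spread of projected positives
    s_neg = statistics.stdev(project(neg, w))  # spread of projected negatives
    return (len(neg) * s_pos) / (len(pos) * s_neg)


# Toy data: 2 positive and 4 negative samples, normal vector along the x-axis.
pos = [(2.0, 0.0), (4.0, 0.0)]
neg = [(-1.0, 0.0), (-2.0, 0.0), (-3.0, 0.0), (-4.0, 0.0)]
w = (1.0, 0.0)
ratio = penalty_ratio(pos, neg, w)
```

The resulting ratio would then fix the relative size of the two per-class penalty factors when the standard SVM is trained the second time.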


