
Support vector machine for unbalanced data based on sample properties under-sampling approaches (cited by: 25)
Abstract: Classical support vector machines (SVMs) classify poorly when the training data are unbalanced. To address this, an unbalanced SVM classification algorithm based on sample-property under-sampling is proposed. The algorithm first uses the information content of each sample in the kernel space to select a given proportion of majority-class samples that lie near the unbalanced classification interface. It then uses sample density information to pick the most representative majority-class samples from that selection, which both reduces the number of majority-class samples and shifts the classification interface toward the majority class. Experimental results show that, compared with other data-preprocessing methods for unbalanced data, the proposed algorithm improves the minority-class classification performance, the overall classification performance, and the robustness of SVM on unbalanced data.
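Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core list; 2013, No. 7, pp. 978-984 (7 pages)
Funding: National Natural Science Foundation of China, General Program (61074076); China Postdoctoral Science Foundation (20090450119); Doctoral Program New Teacher Fund of China (20092304120017); Heilongjiang Postdoctoral Fund (LBH-Z08227)
Keywords: unbalanced data; support vector machine; sample properties; under-sampling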
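
The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the paper's formulas, so both scoring rules here are assumptions. Stage 1 approximates "near the classification interface" by the kernel-space distance from a majority sample to the minority-class centroid, and stage 2 approximates "density" by the mean RBF similarity to the `k` most similar retained neighbours. The names `rbf`, `kernel_dist_to_centroid`, `undersample`, `keep_ratio`, `n_final`, `k`, and `gamma` are all hypothetical.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_dist_to_centroid(x, points, within, gamma=1.0):
    """Squared kernel-space distance from phi(x) to the mean of phi(points).

    `within` is the mean pairwise kernel value of `points`, precomputed once.
    """
    cross = sum(rbf(x, p, gamma) for p in points) / len(points)
    return rbf(x, x, gamma) - 2.0 * cross + within


def undersample(majority, minority, keep_ratio=0.5, n_final=None, k=3, gamma=1.0):
    """Two-stage under-sampling of the majority class (illustrative only)."""
    n_min = len(minority)
    within = sum(rbf(p, q, gamma) for p in minority for q in minority) / (n_min * n_min)

    # Stage 1 (assumed proxy): keep the share of majority samples that lie
    # closest to the minority-class centroid in kernel space.
    n_stage1 = max(1, math.ceil(keep_ratio * len(majority)))
    ranked = sorted(
        majority,
        key=lambda x: kernel_dist_to_centroid(x, minority, within, gamma),
    )
    near = ranked[:n_stage1]

    # Stage 2 (assumed proxy): score each retained sample by its mean kernel
    # similarity to its k most similar retained neighbours, keep the densest.
    def density(i):
        sims = sorted(
            (rbf(near[i], near[j], gamma) for j in range(len(near)) if j != i),
            reverse=True,
        )[:k]
        return sum(sims) / len(sims) if sims else 0.0

    order = sorted(range(len(near)), key=density, reverse=True)
    if n_final is None:
        n_final = n_min  # default: balance the two classes
    return [near[i] for i in order[:min(n_final, len(near))]]
```

With the default `n_final`, the function returns as many majority samples as there are minority samples, so an SVM trained on the result sees balanced classes. Swapping the stage 1 score for the absolute decision value of a preliminary SVM would be another reasonable reading of the abstract.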


