
KFDA-Boosting algorithm for imbalanced data classification (cited by: 9)
Abstract: Imbalanced data distributions and nonlinear data characteristics make classification harder. In particular, minority-class samples in imbalanced data are difficult to identify, which degrades overall classification performance. To address this problem, this paper proposes the KFDA-Boosting algorithm, which combines the ability of KFDA (kernel Fisher discriminant analysis) to extract nonlinear features of samples effectively with the idea of the Boosting algorithm from ensemble learning. To verify the effectiveness and superiority of the algorithm on imbalanced data, the G-mean value and the precision and recall of the minority class are used as evaluation metrics. KFDA-Boosting is tested on 10 datasets from UCI and compared experimentally with six other classification algorithms, including the support vector machine. The results show that, for imbalanced data classification, and especially for data with a high degree of imbalance or nonlinear characteristics, KFDA-Boosting identifies the minority class effectively compared with the other algorithms and, overall, achieves significant classification performance and good stability.
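The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary problem and the common definition of G-mean as the geometric mean of minority-class recall (sensitivity) and majority-class recall (specificity); the record does not state the paper's exact formula, and the function name `minority_metrics` and the toy labels are hypothetical.

```python
from math import sqrt


def minority_metrics(y_true, y_pred, minority=1):
    """Return (precision, recall, g_mean), treating `minority` as the positive class.

    Assumption: G-mean = sqrt(sensitivity * specificity), the usual binary
    definition; the paper's own formula is not given in this record.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)

    # Guard against empty denominators (e.g. no predicted positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return precision, recall, sqrt(recall * specificity)


# Toy example (hypothetical labels): 3 minority samples, 7 majority samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, g_mean = minority_metrics(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} g_mean={g_mean:.3f}")
```

Plain accuracy on this toy example would be 0.8 even though one of the three minority samples is missed, which is why minority-class metrics and G-mean are preferred on imbalanced data.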
Authors: Wang Lai (王来); Fan Chongjun (樊重俊); Yang Yunpeng (杨云鹏); Yuan Guanghui (袁光辉) (Business School, University of Shanghai for Science & Technology, Shanghai 200093, China; School of Information Management & Engineering, Shanghai University of Finance & Economics, Shanghai 200433, China; Experimental Center, Shanghai University of Finance & Economics, Shanghai 200433, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 3, pp. 807-811 (5 pages)
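Funding: National Natural Science Foundation of China (71303157); Key Research and Innovation Fund of the Shanghai Municipal Education Commission (14ZZ131); Shanghai First-Class Discipline Fund (S1205YLXK); Shanghai Social Science Planning Youth Project Fund (2014EGL007); Hujiang Foundation (D14008)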
Keywords: kernel Fisher discriminant analysis; ensemble learning; imbalanced data; classification
