
Fixed-Radius Nearest Neighbor Progressive Competition Algorithm (FRNNPC) for Imbalanced Classification (cited by: 1)
Abstract: Many real-world datasets suffer from the class imbalance problem. When traditional classification algorithms are applied to imbalanced data, they tend to misclassify minority-class samples. To improve classification accuracy on the minority class, this paper proposes a fixed-radius nearest neighbor progressive competition algorithm (FRNNPC). As a preprocessing step, FRNNPC applies the fixed-radius nearest neighbor (FRNN) rule to eliminate unnecessary samples globally. It then runs the progressive competition algorithm (NPC) on the resulting candidate data: the scores of the query sample's nearest neighbors are accumulated one by one until the total score of one class exceeds that of the other. The method handles class imbalance effectively and requires no manually set parameters. Experiments comparing the proposed method with four representative algorithms on 10 imbalanced datasets demonstrate its effectiveness.
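The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the radius rule, the scoring formula, or the exact stopping condition, so the choices below are assumptions made only for illustration. The function name `frnnpc_predict`, the default radius (mean distance from the query to all training samples), the score (inverse distance divided by class size), and the rule that the competition is decided only once both classes have been seen are all hypothetical.

```python
import math
from collections import Counter


def frnnpc_predict(train_X, train_y, query, radius=None):
    """Illustrative sketch: FRNN filtering, then progressive competition."""
    class_counts = Counter(train_y)
    # Distance from the query to every training sample, nearest first.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))

    if radius is None:
        # ASSUMPTION: a data-driven radius (mean query-to-sample distance);
        # the paper's actual parameter-free rule is not stated in the abstract.
        radius = sum(d for d, _ in dists) / len(dists)

    # FRNN step: keep only the samples that lie inside the fixed radius.
    candidates = [(d, y) for d, y in dists if d <= radius]
    if not candidates:
        candidates = dists[:1]  # fall back to the single nearest sample

    # Progressive competition: add neighbor scores one at a time.
    totals = Counter()
    for d, y in candidates:
        # ASSUMPTION: closer neighbors and rarer classes score higher.
        totals[y] += 1.0 / ((d + 1e-12) * class_counts[y])
        # ASSUMPTION: decide as soon as both classes have scored and one leads.
        if len(totals) >= 2:
            (top, top_score), (_, second_score) = totals.most_common(2)
            if top_score > second_score:
                return top
    # Candidates exhausted: return the class with the highest total.
    return totals.most_common(1)[0][0]
```

Dividing each score by the class size is one simple way to keep a large majority class from dominating the totals; any other imbalance-aware score could be substituted without changing the overall structure.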
Authors: ZHOU Peng (周鹏); YI Jing (伊静); ZHU Zhen-fang (朱振方); LIU Pei-yu (刘培玉) (School of Information Science & Engineering, Shandong Normal University, Jinan 250358, Shandong, China; Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250358, Shandong, China; School of Computer Science & Technology, Shandong Jianzhu University, Jinan 250014, Shandong, China; School of Information Science and Electric Engineering, Shandong Jiaotong University, Jinan 250357, Shandong, China)
Source: Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2019, No. 3, pp. 102-109 (8 pages)
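Funding: National Natural Science Foundation of China (61373148, 61502151); Ministry of Education Humanities and Social Sciences Fund (14YJC860042); Natural Science Foundation of Shandong Province (ZR2014FL010)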
Keywords: imbalanced data; nearest neighbor rule; pattern classification
