Boosting imbalanced data learning with Wiener process oversampling (cited by: 1)

Abstract: Learning from imbalanced data is a challenging task in a wide range of applications and has attracted significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesis outperforms replication because it supplies additional information on the minority class. However, the additional information needs to follow the same normal distribution as the training set, which constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings a physical phenomenon into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while keeping the same normal distribution. WPO achieves satisfactory performance at much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalanced-learning solutions.
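To make the idea of synthesizing minority samples with a random walk concrete, here is a minimal sketch. It is not the paper's WPO algorithm, whose details this record does not give. The function name `wiener_oversample` and the parameters `steps`, `sigma`, and `seed` are hypothetical. The sketch assumes each synthetic sample starts at a randomly chosen minority sample and then takes a short discretized Wiener walk with independent Gaussian increments of standard deviation `sigma * sqrt(dt)`, so new points may fall slightly outside the original attribute ranges.

```python
import random


def wiener_oversample(minority, n_new, steps=10, sigma=0.05, seed=0):
    """Synthesize n_new points by random-walking copies of minority samples.

    Hypothetical illustration only: a discretized Wiener process with
    independent Gaussian increments, not the published WPO procedure.
    """
    rng = random.Random(seed)
    dt = 1.0 / steps
    scale = sigma * dt ** 0.5  # standard deviation of one increment
    synthetic = []
    for _ in range(n_new):
        # Start each walk from an existing minority sample.
        point = list(rng.choice(minority))
        for _ in range(steps):
            # Add an independent Gaussian increment to every attribute.
            point = [x + rng.gauss(0.0, scale) for x in point]
        synthetic.append(point)
    return synthetic


# Toy minority class with two attributes (made-up values).
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1]]
new_samples = wiener_oversample(minority, n_new=5)
balanced_minority = minority + new_samples
```

With `sigma=0.0` the walk never moves, so the sketch reduces to plain replication of existing samples, the other oversampling strategy the abstract mentions.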
Source: Frontiers of Computer Science (indexed in SCIE, EI, CSCD), 2017, Issue 5, pp. 836-851 (16 pages). Chinese title: 中国计算机科学前沿(英文版)
Funding: This research was partially supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200), the National Natural Science Foundation of China (Grant Nos. M1552006, 61403369, 61272427, and 61363030), the Xinjiang Uygur Autonomous Region Science and Technology Project (201230123), the Beijing Key Lab of Intelligent Telecommunication Software and Multimedia (ITSM201502), and the Guangxi Key Laboratory of Trusted Software (kx201418).
Keywords: imbalanced-data learning, oversampling, ensemble learning, Wiener process, AdaBoost