Transfer synthetic over-sampling for class-imbalance learning with limited minority class data (cited by: 3)

Abstract: The problem of limited minority class data is encountered in many class-imbalanced applications, but has received little attention. Synthetic over-sampling, a popular family of class-imbalance learning methods, can introduce much noise when the minority class has limited data, since the synthetic samples are not i.i.d. samples of the minority class. Most sophisticated synthetic sampling methods tackle this problem by denoising or by generating samples more consistent with the ground-truth data distribution, but their assumptions about the true noise or the ground-truth data distribution may not hold. To adapt synthetic sampling to the problem of limited minority class data, the proposed Traso framework treats synthetic minority class samples as an additional data source and exploits transfer learning to transfer knowledge from them to the minority class. As an implementation, the TrasoBoost method first generates synthetic samples to balance class sizes. Then, in each boosting iteration, the weights of synthetic samples decrease and the weights of original data increase when they are misclassified, and remain unchanged otherwise. The misclassified synthetic samples are potential noise and thus have smaller influence in the following iterations. In addition, the weights of minority class instances change more than those of majority class instances, so that they are more influential, and only original data are used to estimate the error rate, so that the estimate is immune to noise. Finally, since the synthetic samples are highly related to the minority class, all of the weak learners are aggregated for prediction. Experimental results show that TrasoBoost outperforms many popular class-imbalance learning methods.
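The per-iteration weight update described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual update rule: the abstract gives only the direction of each change, so the function name `update_weights`, the parameters `syn_decay` and `minority_power`, and the AdaBoost-style growth factor `(1 - err) / err` are all hypothetical choices made here for illustration.

```python
def update_weights(weights, is_synthetic, is_minority, misclassified,
                   syn_decay=0.5, minority_power=2.0):
    """One boosting-round weight update in the spirit of the abstract.

    Hypothetical sketch: `syn_decay`, `minority_power`, and the growth
    factor below are assumptions, not values taken from the paper.
    Assumes at least one original (non-synthetic) sample is present.
    """
    # The error rate is estimated on original (non-synthetic) samples only.
    orig_total = sum(w for w, s in zip(weights, is_synthetic) if not s)
    orig_wrong = sum(w for w, s, m in zip(weights, is_synthetic, misclassified)
                     if not s and m)
    err = orig_wrong / orig_total
    err = min(max(err, 1e-10), 0.499)   # keep the growth factor finite and > 1
    grow = (1.0 - err) / err            # assumed AdaBoost-style factor, > 1

    updated = []
    for w, syn, mino, wrong in zip(weights, is_synthetic,
                                   is_minority, misclassified):
        if wrong:
            # Minority instances change more than majority instances.
            power = minority_power if mino else 1.0
            # Misclassified synthetic samples shrink; original samples grow.
            factor = syn_decay if syn else grow
            w = w * factor ** power
        # Correctly classified samples keep their weight unchanged.
        updated.append(w)
    return updated, err
```

The weights are deliberately left unnormalized so that correctly classified samples remain literally unchanged, matching the wording of the abstract; a full booster would also need the synthetic-sample generation step and the aggregation of all weak learners, which are omitted here.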
Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2019, Issue 5, pp. 996-1009 (14 pages); Chinese title: 中国计算机科学前沿(英文版)
Funding: The authors wish to thank the associate editor and anonymous reviewers for their helpful comments and suggestions. This work was supported by the National Key R&D Program of China (2017YFB1002801), the National Natural Science Foundation of China (Grant Nos. 61473087, 61573104), and the Natural Science Foundation of Jiangsu Province (BK20141340), and partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization.