
Study of active learning algorithms on imbalanced data based on sampling technique

Cited by: 2
Abstract  When active learning is run on data with a skewed class distribution, the classification hyperplane tends to be biased toward the majority class, which can cause active learning to fail. To address this problem, this paper uses instance sampling as the balance control strategy for the learning process. After investigating several existing sampling algorithms, it proposes a boundary oversampling algorithm and combines it with active learning. In addition, because the extreme learning machine (ELM) offers strong generalization ability and fast training, it is adopted as the base classifier to speed up the active learning process. The active learning algorithms with balance control were evaluated on 12 benchmark data sets. The results show that active learning is indeed negatively affected by class imbalance, and that active learning algorithms incorporating sampling perform clearly better.
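The pipeline described in the abstract (rebalance the labeled set, train an ELM, query the most uncertain unlabeled instances, repeat) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the proposed boundary oversampling algorithm, so plain random oversampling is used as a stand-in, and the names `ELM`, `random_oversample`, `active_learn`, and `make_imbalanced`, the sigmoid hidden layer, the margin-based query rule, and the toy Gaussian data are all hypothetical choices rather than the paper's method.

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine: random hidden layer, least-squares output."""
    def __init__(self, n_hidden=40, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Hidden weights are drawn once and never trained; only beta is solved for.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ y  # labels in {-1, +1}
        return self

    def decision(self, X):
        # Signed score; values near 0 lie close to the decision boundary.
        return self._hidden(X) @ self.beta

def random_oversample(X, y, rng):
    """Assumed stand-in for the paper's boundary oversampling:
    duplicate random minority samples until both classes are the same size."""
    classes, counts = np.unique(y, return_counts=True)
    if len(classes) < 2 or counts[0] == counts[1]:
        return X, y
    minority = classes[np.argmin(counts)]
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=counts.max() - counts.min(), replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

def active_learn(X_pool, y_pool, init_idx, n_rounds, batch, seed=0):
    """Pool-based active learning with rebalancing before every retraining."""
    rng = np.random.default_rng(seed)
    labeled = list(init_idx)
    seen = set(labeled)
    unlabeled = [i for i in range(len(X_pool)) if i not in seen]
    for _ in range(n_rounds):
        Xb, yb = random_oversample(X_pool[labeled], y_pool[labeled], rng)
        model = ELM(seed=seed).fit(Xb, yb)
        # Uncertainty query: pick the unlabeled points with the smallest |score|.
        scores = np.abs(model.decision(X_pool[unlabeled]))
        chosen = {unlabeled[i] for i in np.argsort(scores)[:batch]}
        labeled.extend(sorted(chosen))  # the oracle label is read from y_pool
        unlabeled = [i for i in unlabeled if i not in chosen]
    Xb, yb = random_oversample(X_pool[labeled], y_pool[labeled], rng)
    return ELM(seed=seed).fit(Xb, yb), labeled

def make_imbalanced(n_major=300, n_minor=30, seed=0):
    """Toy 2-D data with a 10:1 class ratio (majority = -1, minority = +1)."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(0.0, 1.0, (n_major, 2)),
                   rng.normal(2.5, 1.0, (n_minor, 2))])
    y = np.concatenate([-np.ones(n_major), np.ones(n_minor)])
    return X, y

if __name__ == "__main__":
    X, y = make_imbalanced()
    model, labeled = active_learn(X, y, [0, 1, 2, 3, 4, 300, 301],
                                  n_rounds=5, batch=4)
    print("labeled instances:", len(labeled))
```

Swapping `random_oversample` for another sampler (undersampling, or a boundary-focused oversampler) changes only the balance control step, which is the part of the loop the paper varies in its comparison.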
Source  Electronic Design Engineering (《电子设计工程》), 2018, No. 1, pp. 7-12, 19 (7 pages)
Funding  National Natural Science Foundation of China (61305058); Natural Science Foundation of Jiangsu Province (BK20130471); China Postdoctoral Science Foundation (2013M540404)
Keywords  class imbalance; active learning; extreme learning machine; instance sampling; boundary over-sampling