Abstract
The Synthetic Minority Over-sampling Technique (SMOTE) is a typical over-sampling method for data preprocessing. It balances imbalanced data effectively, but it also introduces noise and other problems that reduce classification accuracy. To address this, the paper draws on the classification performance of active learning support vector machines (SVM) and proposes ALSMOTE, an imbalanced data classification method based on active learning and SMOTE. Because an active learning SVM uses a distance-based strategy to actively select the most informative samples, it can pick the valuable majority class samples from imbalanced data and discard those of little value, which improves computational efficiency and mitigates the problems introduced by SMOTE. The method first applies SMOTE to balance a small portion of the samples and obtain an initial classifier, then uses the active learning strategy to adjust the classifier's accuracy. Experimental results show that the method effectively improves classification accuracy on imbalanced data.
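The two building blocks named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's ALSMOTE implementation: the abstract does not give the exact selection rule, so the sketch assumes that "distance-based" means distance to the SVM separating hyperplane. The function names `smote` and `closest_to_hyperplane`, the toy data, and the fixed `w` and `b` (hypothetical stand-ins for a trained linear SVM) are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def closest_to_hyperplane(pool, w, b, n_select):
    """Return the n_select pool points nearest to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))

    def distance(x):
        return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

    return sorted(pool, key=distance)[:n_select]


# Toy data (hypothetical): a few minority points and a pool of majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_pool = [(3.0, 3.0), (2.0, 2.1), (5.0, 4.0), (1.6, 1.5)]

# Step 1: over-sample the minority class with SMOTE-style interpolation.
new_points = smote(minority, n_new=4)

# Step 2: pick the majority samples closest to an assumed decision boundary.
picked = closest_to_hyperplane(majority_pool, w=(1.0, 1.0), b=-3.0, n_select=2)
```

In a full active learning loop, the classifier would be retrained after each batch of selected samples and the selection repeated; the sketch shows a single round only.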
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
Peking University Core Journal (北大核心)
2012, No. 3, pp. 91-93, 162 (4 pages in total)
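Funding
National Natural Science Foundation of China (10771092)
Doctoral Start-up Fund of the Liaoning Provincial Department of Science and Technology (20081079)
Dalian Science and Technology Fund (2010J21DW019)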