
A Spatial Interpolation Method for Classifying Class-Imbalanced Data Sets (cited by: 2)

S-SMOTE Method in Class Imbalance Data Sets
Abstract: When a classifier is trained on a class-imbalanced data set, its predictions skew toward the majority class and the recognition rate for the minority class is low. To improve minority-class accuracy, this paper proposes an improved SMOTE method, the spatial interpolation method, called Space Synthetic Minority Over-sampling Technique (S-SMOTE). Each minority-class sample and its k nearest neighbors are used to construct a hyper-geometric body ("super geometry"), and synthetic minority-class samples are generated at random inside it. When the k nearest neighbors include majority-class samples, the space used to generate synthetic samples is reduced to avoid introducing noise. The method strengthens training on samples that are easily misclassified and lowers the degree of class imbalance, and the validity of the synthetic samples is then verified. Simulations with several classifiers on real data sets show that the method outperforms SMOTE in classification performance on both the minority class and the data set as a whole.
Authors: 董璇 (Dong Xuan), 蔡立军 (Cai Lijun)
Source: Computer Simulation (《计算机仿真》; indexed in CSCD and the Peking University Core Journals list), 2012, No. 12, pp. 175-179 (5 pages)
Keywords: class imbalance; hyper-geometric body (super geometry); over-sampling; sample generation
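
The sampling idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration, not the authors' algorithm: the hyper-geometric body is taken to be the convex hull of a minority sample and its minority-class neighbors, the "reduced space" is modeled by a hypothetical `shrink` factor that keeps synthetic points closer to the seed sample, and minority samples with no minority neighbor are skipped. The function name `s_smote_sketch` and all its parameters are likewise hypothetical.

```python
import numpy as np

def s_smote_sketch(X, y, minority_label, k=5, n_new_per_sample=1,
                   shrink=0.5, seed=0):
    """Illustrative sketch only (not the paper's exact method).

    For each minority sample, draw synthetic points inside the convex hull
    of the sample and the minority-class members of its k nearest neighbours.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    k = min(k, len(X) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for i in np.flatnonzero(y == minority_label):
        # k nearest neighbours of sample i among ALL samples, excluding itself
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        nn = np.argsort(dist)[:k]
        minority_nn = nn[y[nn] == minority_label]
        if minority_nn.size == 0:
            continue  # assumption: no minority neighbour, so skip this seed
        # assumption: shrink the sampling region when any neighbour is majority
        scale = shrink if minority_nn.size < nn.size else 1.0
        for _ in range(n_new_per_sample):
            # random convex weights give a point in the neighbours' convex hull
            w = rng.dirichlet(np.ones(minority_nn.size))
            target = w @ X[minority_nn]
            # move from the seed toward that point; scale < 1 stays nearer the seed
            t = rng.random() * scale
            synthetic.append(X[i] + t * (target - X[i]))
    return np.array(synthetic).reshape(-1, X.shape[1])
```

Plain SMOTE interpolates along the line segment between a sample and one neighbor; sampling from a region spanned by several neighbors at once is one plausible reading of "inside the super geometry". A call such as `s_smote_sketch(X, y, minority_label=1, k=5)` returns an array of synthetic minority samples that can be appended to the training set.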


