
An improved method on identification of minority class sample (cited by: 1)
Abstract: When imbalanced data sets are classified, the results tend to skew toward the majority class, and the recognition rate for the minority class is low. To improve classification accuracy on the minority class, the S-SMO-Boost method is proposed. During the iterations of the AdaBoost boosting algorithm, the method constructs virtual samples from the minority class samples that are misclassified, which strengthens training on samples that are easily misclassified. The virtual samples are built by a spatial interpolation method called S-SMOTE: a super geometry is constructed around each misclassified minority class sample from that sample and its k nearest neighbors, and valid virtual samples are generated by random interpolation inside that super geometry. Experiments on real data sets show that S-SMO-Boost improves classification performance on imbalanced data sets.
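The interpolation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact shape of the super geometry, so this example assumes it is the convex hull of a misclassified minority sample and its k nearest minority-class neighbors, and it samples inside that hull with random convex weights. The function name `interpolate_in_hull` and the parameters `k`, `n_new`, and `rng` are hypothetical and do not come from the paper. The AdaBoost loop that would select the misclassified samples is omitted.

```python
import numpy as np


def interpolate_in_hull(x, minority, k=3, n_new=5, rng=None):
    """Generate n_new virtual samples inside the convex hull of x and its
    k nearest minority-class neighbours.

    Assumption: the convex hull stands in for the paper's "super geometry";
    the published construction may differ.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    minority = np.asarray(minority, dtype=float)

    # Euclidean distance from x to every minority sample.
    dist = np.linalg.norm(minority - x, axis=1)

    # Drop points that coincide with x, then keep the k nearest.
    order = np.argsort(dist)
    order = order[dist[order] > 0][:k]

    # Vertices of the hull: x itself plus its neighbours, shape (<=k+1, d).
    vertices = np.vstack([x, minority[order]])

    # Dirichlet weights are non-negative and sum to 1, so each row of
    # weights @ vertices is a convex combination, i.e. a point in the hull.
    weights = rng.dirichlet(np.ones(len(vertices)), size=n_new)
    return weights @ vertices


if __name__ == "__main__":
    minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
    print(interpolate_in_hull([0.0, 0.0], minority, k=2, n_new=3, rng=0))
```

In a boosting setting, a function like this would be called on each minority sample misclassified in the current round, and the returned points would be added to the training set for the next round. Unlike classic SMOTE, which interpolates along a line segment to one neighbor, this variant fills the region spanned by several neighbors.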
Authors: DONG Xuan (董璇), CAI Lijun (蔡立军)
Source: Microcomputer & Its Applications (《微型机与应用》), 2012, No. 18, pp. 60-62, 65 (4 pages)
Keywords: imbalanced data sets; super geometry; sample generation; boosting algorithm


