Implementing Dimension Reduction on Unbalanced Data Sets (针对不平衡数据集的维数约简方法)
Abstract: To address the classification of imbalanced data, this paper proposes a method that applies feature selection to imbalanced datasets. The dataset is first preprocessed; the features that best represent the data are then selected, and classifiers are built on the reduced dataset. This simplifies the data while also improving classification performance. Experimental results show that the approach improves classification accuracy on unbalanced datasets.
Source: Information Technology and Informatization (《信息技术与信息化》), 2011, No. 5, pp. 62-64 (3 pages)
Keywords: unbalanced dataset; feature selection; clustering
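The abstract describes selecting a small set of representative features from an imbalanced dataset before building a classifier, but it does not say which selection criterion the paper uses. The following is a minimal sketch of that general idea only, assuming a Fisher-score ranking as a stand-in criterion; the function names (`fisher_scores`, `select_top_k`) and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each feature by between-class separation over within-class spread.

    Assumption: the Fisher score is used here purely for illustration;
    the paper's actual selection criterion is not given in the abstract.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        # Guard against constant features, which carry no class information.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k best-scoring features and the reduced data."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Hypothetical toy data: 8 majority samples (label 0), 2 minority samples (label 1).
# Features 0 and 2 separate the classes; feature 1 is noise.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [1.1, 6.0, 0.2],
    [1.0, 2.0, 0.4],
    [0.8, 5.5, 0.1],
    [1.3, 3.5, 0.3],
    [1.1, 4.5, 0.2],
    [3.0, 4.0, 0.9],
    [3.2, 4.4, 1.0],
]
y = [0] * 8 + [1] * 2

keep, X_reduced = select_top_k(X, y, k=2)
print("kept feature indices:", keep)
```

On this toy data the noise feature is discarded and the two informative ones are kept. A classifier would then be trained on `X_reduced`; the abstract does not state which classifier or which evaluation measure the paper uses, so that step is left out here.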