
Multilayer neural network model for unbalanced data
Abstract: In the classification of unbalanced data, the imbalance between classes often degrades classifier performance. Using AUC (the area under the ROC curve) as the evaluation metric, this paper combines one-class F-score feature selection with a genetic algorithm to select a feature subset that is more favorable for classifying unbalanced data, and builds a multilayer neural network on that subset, giving a deep model better suited to unbalanced data classification. The multilayer neural network model is implemented in TensorFlow, tested on four different UCI datasets, and compared with traditional machine learning algorithms such as naive Bayes, K-nearest neighbors (KNN), and neural networks. The experiments show that the proposed model performs better on unbalanced data classification.
Source: Chinese Journal on Internet of Things (《物联网学报》), 2018, No. 2, pp. 65-72 (8 pages)
Funding: National Key Research and Development Program of China (No. 2016YFC0901303)
Keywords: unbalanced data; one-class F-score feature selection; genetic algorithm; multilayer neural network
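The two scoring ingredients named in the abstract, AUC as the evaluation metric and F-score as a per-feature relevance measure, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the standard two-class F-score (between-class separation of the feature means divided by the sum of the within-class sample variances), whereas the paper's one-class variant may be defined differently. The function names `auc`, `f_score`, and `rank_features` and the toy data are hypothetical. The genetic-algorithm search and the TensorFlow network are omitted.

```python
from statistics import mean, variance


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f_score(feature, labels):
    """Standard two-class F-score of one feature column.

    Assumption: this is the usual two-class formula, not necessarily the
    one-class variant used in the paper. Each class needs at least two
    samples because sample variances (n - 1 denominator) are used.
    """
    pos = [x for x, y in zip(feature, labels) if y == 1]
    neg = [x for x, y in zip(feature, labels) if y == 0]
    overall = mean(feature)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return feature indices sorted from highest to lowest F-score."""
    columns = [list(col) for col in zip(*rows)]
    scores = [f_score(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy data: 2 minority (label 1) and 6 majority (label 0) samples.
# Feature 0 separates the classes; feature 1 has identical class means.
rows = [
    (5.0, 1.0), (6.0, 3.0),
    (1.0, 2.0), (2.0, 1.0), (1.5, 3.0), (2.5, 2.0), (1.0, 1.5), (2.0, 2.5),
]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

ranking = rank_features(rows, labels)
auc_feature0 = auc(labels, [r[0] for r in rows])
# A model that gives every sample the same score predicts "all majority":
# its accuracy on this data would be 6/8, yet its AUC is only chance level.
auc_constant = auc(labels, [0.0] * len(labels))
```

On this toy data, a constant scorer would reach 75% accuracy by always predicting the majority class while its AUC stays at 0.5. That illustrates why a rank-based metric such as AUC is a reasonable evaluation choice for unbalanced classes.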
