Research on Data Normalization for SVM Training (cited 56 times)

Abstract: Data normalization is a necessary preprocessing step for training support vector machines (SVMs). Commonly used methods include scaling to [-1, +1] and standardizing to N(0, 1), but no study of the scientific basis for these common methods has yet been found in the existing literature. Through empirical experiments, this paper investigates why data should be normalized and how normalizing or not normalizing affects training efficiency and the predictive ability of the model. Using standard data sets, SVMs were trained on the original unnormalized data, on data normalized by different methods, on artificially inverse-normalized data, and on arbitrarily selected attribute columns, and the change in the objective function value with the number of iterations, the training time, and the test and k-fold cross-validation (k-CV) performance of the model were recorded. The experimental results show that normalization methods that confine the data values to a conventional range, such as [-0.5, +0.5] through [-5, +5] or N(0, 1) through N(0, 5), all yield the best predictive model with the shortest training time. This work provides a scientific basis for data normalization in SVMs and in machine learning algorithms in general.
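The two families of methods named in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the paper's code: the function names, the sample values, and the handling of constant columns are assumptions, and the second parameter of N(0, s) is read here as a standard deviation, which the abstract does not specify.

```python
from statistics import mean, pstdev


def scale_to_range(column, lo=-1.0, hi=1.0):
    """Linearly map one attribute column onto [lo, hi] (min-max scaling)."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # Constant column: there is no spread to scale, so use the midpoint.
        return [(lo + hi) / 2.0 for _ in column]
    span = c_max - c_min
    return [lo + (hi - lo) * (x - c_min) / span for x in column]


def standardize(column, sigma=1.0):
    """Shift one attribute column to mean 0 and rescale it to standard deviation sigma."""
    mu = mean(column)
    sd = pstdev(column)  # population standard deviation of the column
    if sd == 0:
        # Constant column: every value equals the mean.
        return [0.0 for _ in column]
    return [sigma * (x - mu) / sd for x in column]


raw = [2.0, 4.0, 6.0, 10.0]        # made-up attribute values
scaled = scale_to_range(raw)        # the [-1, +1] method
wide = scale_to_range(raw, -5, 5)   # the [-5, +5] variant
standardized = standardize(raw)     # the N(0, 1) method
```

In practice the minimum, maximum, mean, and standard deviation would be computed on the training set only and then reused to transform the test set, so that both sets share one scale.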
Source: Journal of Shandong Normal University (Natural Science), CAS, 2016, No. 4, pp. 60-65 (6 pages)
Keywords: support vector machine (SVM); SMO; data normalization; data pre-processing; cross validation