PCA based on mutual information for feature selection (cited by: 105)

Abstract: Principal component analysis (PCA) is a commonly used method for feature selection. The classical procedure derives the principal components from the correlation matrix of the features, but correlation cannot capture nonlinear relationships between variables. Mutual information measures the strength of the interdependence between two variables and is not limited to linear correlation. On this basis, a PCA feature selection algorithm based on mutual information (MIPCA) is proposed. The algorithm computes the mutual information between features and uses the eigenvalues of the mutual information matrix as the criterion to determine the number of principal components and to assess the effect of feature selection. The proposed method is compared with conventional PCA on example cases, and the classification performance is analyzed with a neural network as the classifier.
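The abstract gives the outline of MIPCA but not its details, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes a histogram (equal-width binning) estimate of mutual information, a cumulative eigenvalue contribution threshold for choosing the number of components, and clipping of any negative eigenvalues to zero. The function names `mutual_information`, `mi_matrix`, and `mipca` and the parameters `bins` and `threshold` are hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats. The binning is an assumption."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def mi_matrix(X, bins=8):
    """Symmetric matrix of pairwise mutual information between the columns of X."""
    n_features = X.shape[1]
    M = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i, n_features):
            # The diagonal entry I(X_i; X_i) equals the binned entropy of X_i.
            M[i, j] = M[j, i] = mutual_information(X[:, i], X[:, j], bins)
    return M


def mipca(X, threshold=0.85, bins=8):
    """Project X onto the leading eigenvectors of its mutual information matrix.

    The number of components k is the smallest count whose cumulative
    eigenvalue share reaches `threshold` (an assumed selection rule).
    """
    M = mi_matrix(X, bins)
    eigvals, eigvecs = np.linalg.eigh(M)      # M is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]         # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = np.clip(eigvals, 0.0, None)       # assumption: ignore negative eigenvalues
    ratio = np.cumsum(share) / share.sum()
    k = int(np.argmax(ratio >= threshold)) + 1
    Xc = X - X.mean(axis=0)                   # center the features before projecting
    return Xc @ eigvecs[:, :k], eigvals, k
```

Calling `mipca(X)` on an array of shape `(n_samples, n_features)` returns the projected data, the sorted eigenvalues, and the chosen number of components. Replacing `mi_matrix` with a correlation matrix gives conventional PCA, which makes the two approaches straightforward to compare on the same data.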
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list, 2013, No. 6, pp. 915-919 (5 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (11104316); Natural Science Foundation of Shanghai (11ZR1446000)
Keywords: mutual information; principal component analysis; feature selection