Abstract
Feature extraction is a key step in data analysis tasks such as classification and clustering, and it strongly affects the results. Principal component analysis (PCA), the commonly used method, is linear and therefore has difficulty extracting nonlinear features. Kernel principal component analysis (KPCA), obtained by introducing a kernel function, uses the Euclidean distance as its similarity measure, which sometimes fails to capture the essential features of the data. This paper replaces the Euclidean distance with the geodesic distance, yielding a KPCA method based on geodesic distance. Validation on simulated data and on Tennessee Eastman (TE) process data indicates that the proposed method extracts features better than conventional KPCA.
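The abstract does not say how the geodesic distance is computed or which kernel it is placed in, so the following is only a minimal sketch under stated assumptions: geodesic distances are approximated by shortest paths on a k-nearest-neighbour graph (as in Isomap), and they are substituted for the Euclidean distance in a Gaussian kernel. The function names and the parameters `k` and `sigma` are hypothetical and do not come from the paper.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points given as coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def geodesic_distances(points, k):
    """Approximate geodesic distances by shortest paths on a k-NN graph.

    Assumption: neighbourhood-graph construction as in Isomap; the paper
    may construct the geodesic distance differently.
    """
    n = len(points)
    d = [[euclidean(p, q) for q in points] for p in points]
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Connect i to its k nearest neighbours with symmetric edges.
        others = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])
        for j in others[:k]:
            g[i][j] = g[j][i] = d[i][j]
    # Floyd-Warshall all-pairs shortest paths, O(n^3).
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g

def gaussian_kernel(dist, sigma):
    """Gaussian kernel evaluated on a precomputed distance matrix."""
    return [[math.exp(-(v ** 2) / (2 * sigma ** 2)) for v in row] for row in dist]

def center_kernel(K):
    """Centre a symmetric kernel matrix in feature space, as standard KPCA requires."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

# Toy data: 13 points on a half circle of radius 1.
points = [(math.cos(t * math.pi / 12), math.sin(t * math.pi / 12)) for t in range(13)]
geo = geodesic_distances(points, k=2)
K = gaussian_kernel(geo, sigma=1.0)
Kc = center_kernel(K)

# The two end points are 2.0 apart in a straight line, but the path along
# the arc is longer, which is the difference the geodesic distance captures.
print(euclidean(points[0], points[-1]), geo[0][-1])
```

In standard KPCA the eigenvectors of the centred matrix `Kc` would then give the extracted features. A Gaussian kernel built on geodesic distances is not guaranteed to be positive semidefinite, so a practical implementation may need to discard or clip negative eigenvalues.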
Source
《微计算机信息》
2010, No. 31, pp. 123-124, 109 (3 pages)
Control & Automation
Keywords
Geodesic Distance
Kernel Principal Component Analysis
Feature Extraction
Data Analysis