Abstract: Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not belong fully to one class; it may partially belong to other classes. Based on fuzzy set theory, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
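For readers unfamiliar with how KPCA extends PCA, the following is a minimal sketch of plain KPCA only, not of the fuzzy variants (FPCA, KFPCA) proposed in the paper. It assumes an RBF kernel with a hypothetical width parameter `gamma`, and it finds only the leading component by power iteration; all function names are illustrative.

```python
import math

def rbf_kernel_matrix(points, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2). The RBF kernel is an assumption."""
    n = len(points)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(points[i], points[j])))
             for j in range(n)] for i in range(n)]

def center_kernel(K):
    """Double-center K, which corresponds to centering the data in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # equals the column means (K is symmetric)
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_eigenpair(M, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    n = len(M)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v

def first_kernel_component(points, gamma=1.0):
    """Scores of the training points on the first kernel principal component."""
    Kc = center_kernel(rbf_kernel_matrix(points, gamma))
    eigenvalue, v = leading_eigenpair(Kc)
    # Projection of training point i is sqrt(eigenvalue) * v[i].
    return [math.sqrt(eigenvalue) * x for x in v]

# Two well-separated groups on a line receive scores of opposite sign.
scores = first_kernel_component([(0.0,), (0.1,), (0.2,), (3.0,), (3.1,), (3.2,)])
```

Power iteration keeps the sketch free of external dependencies; a practical implementation would more likely call a full symmetric eigensolver and keep several components.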
Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02); Hi-Tech Research and Development Programs of China (No. 2007AA041600)
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA with five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE), and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. In addition, clustering accuracy is used as an indirect performance index to evaluate the quality of the representations produced by these methods. The comparisons are performed on six datasets, and the results are analyzed with the Friedman test and the corresponding post-hoc tests. The results indicate that KECA performs excellently on both the quality assessment criteria and clustering accuracy.
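As an illustration of one of the criteria named above, here is a minimal sketch of the trustworthiness half of the T&C measure, using its commonly cited rank-based definition. The abstract does not give the formula, so the definition, the Euclidean distance, the index-order tie-breaking, and the function names are all assumptions.

```python
import math

def _ranked_neighbors(points, i):
    """Indices of all other points, sorted by Euclidean distance from point i."""
    others = [j for j in range(len(points)) if j != i]
    # sorted() is stable, so ties are broken by index order.
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))

def trustworthiness(original, embedded, k):
    """Penalize points that are k-nearest neighbors in the embedding
    but not in the original space, weighted by their original rank."""
    n = len(original)
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")
    penalty = 0
    for i in range(n):
        original_order = _ranked_neighbors(original, i)
        rank_in_original = {j: r for r, j in enumerate(original_order, start=1)}
        for j in _ranked_neighbors(embedded, i)[:k]:
            if rank_in_original[j] > k:
                penalty += rank_in_original[j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))

line_2d = [(float(x), 0.0) for x in range(6)]
faithful = [(float(x),) for x in range(6)]          # keeps every neighborhood
scrambled = [(0.0,), (5.0,), (1.0,), (4.0,), (2.0,), (3.0,)]

t_faithful = trustworthiness(line_2d, faithful, k=2)
t_scrambled = trustworthiness(line_2d, scrambled, k=2)
```

Continuity is the mirror image of this measure: it swaps the roles of the original and embedded spaces.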
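Abstract: This paper proposes converting the negative instances in a negative-instance grammar database into a kernel matrix using the String Kernel method, and then applying Kernel Principal Component Analysis (KPCA) to extract features from that kernel matrix, so that the original negative-instance database can be split according to these features into several smaller feature tables. A classifier is designed by constructing a negative-instance feature index table; a sentence to be checked is assigned by this classifier to one negative-instance feature table for matching search. The number of feature attributes and the number of records in that table are far smaller than the corresponding numbers in the original database, which greatly increases checking speed without affecting the precision of grammar checking. Comparative tests show that the proposed method is faster while maintaining grammar-checking accuracy.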
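The first step described above, turning strings into a kernel matrix, can be sketched as follows. The abstract does not say which string kernel is used, so this example substitutes the simple p-spectrum kernel (counting shared substrings of length p) as an assumption; the function names and sample strings are hypothetical.

```python
from collections import Counter

def spectrum_features(text, p):
    """Count every contiguous substring of length p in text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))

def spectrum_kernel(s, t, p=2):
    """p-spectrum kernel: inner product of the substring-count vectors of s and t."""
    fs, ft = spectrum_features(s, p), spectrum_features(t, p)
    return sum(count * ft[sub] for sub, count in fs.items())

def kernel_matrix(sentences, p=2):
    """Symmetric matrix of pairwise kernel values, the input that KPCA expects."""
    return [[spectrum_kernel(a, b, p) for b in sentences] for a in sentences]

examples = ["abab", "abc", "bcd"]  # stand-ins for negative-instance sentences
K = kernel_matrix(examples, p=2)
```

The resulting matrix would then be centered and eigen-decomposed by KPCA to obtain the features used to split the database.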
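Abstract: Predicting the remaining useful life (RUL) of rotating machinery is important for the prognostics and health management of industrial equipment. To address the problem that redundant multi-sensor data make degradation information hard to extract and RUL prediction poor, this paper proposes a method based on kernel principal component analysis and long short-term memory networks (KPCA-LSTM) for predicting the RUL of rotating machinery. First, the multidimensional degradation data of the rotating machinery are analyzed, and the data that characterize its degradation are selected. Second, KPCA is applied to fuse the degradation data and extract features, and the dimension-reduced, fused features serve as the input to the prediction model. Then a health indicator for the rotating machinery is constructed, its health states are divided by multi-order differentiation, and a KPCA-LSTM model is built to predict the RUL. Finally, the method is validated experimentally on a mining reducer platform built in the laboratory. The results show that, compared with LSTM and particle-swarm-optimized LSTM, the proposed method predicts better than both, reduces the complexity of model training, and shortens prediction time.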
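Abstract: In cement production, the outlet temperature of the calciner is a very important process parameter. To cope with the diversity of outlet-temperature variables, this paper proposes a combined temperature prediction model that couples kernel principal component analysis (KPCA) with a bidirectional long short-term memory (BiLSTM) neural network to predict the calciner outlet temperature. KPCA extracts the principal components of the influencing factors to reduce the data dimensionality; the reduced principal components are the input of the BiLSTM network, and the calciner outlet temperature is its output. Training the BiLSTM network yields the outlet temperature prediction model. Comparative validation shows that the combined KPCA-BiLSTM model achieves good prediction accuracy.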
Funding: Supported by the National Natural Science Foundation of China (No. 50877004)
Abstract: Serial Analysis of Gene Expression (SAGE) is a powerful tool for analyzing whole-genome expression profiles. SAGE data are large and high-dimensional, so dimensionality reduction and feature extraction are needed to improve accuracy and efficiency when the data are used for pattern recognition and clustering analysis. A Poisson Model-based Kernel (PMK) is proposed, based on the Poisson distribution of SAGE data. Kernel Principal Component Analysis (KPCA) with PMK is then proposed and applied to feature extraction from mouse retinal SAGE data. The computational results show that this algorithm extracts features effectively and reduces the dimensionality of SAGE data.