Funding: National Natural Science Foundation of China (42174131); Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Abstract: In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology, poro-permeability quality, engineering quality, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in the different categories were analyzed and compared with the lithological profiles. The results from numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the integrated reservoir classification results match the lithologic profile well, which demonstrates the reliability of the classification method.
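The membership-based clustering step described in the abstract above can be illustrated with a minimal sketch. It assumes plain fuzzy c-means (the usual reading of FCM) started from random memberships; the simulated annealing genetic algorithm and the PCA step of the paper are not reproduced. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, and the toy points are illustrative assumptions, not the authors' settings or data.

```python
import random

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means on a list of equal-length tuples (hypothetical helper).

    Returns (centers, u), where u[i][k] is the degree of membership of
    point i in cluster k; each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized per point (assumption: no SAGA init).
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Cluster centers are means weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / sum(w)
                for d in range(dim)))
        # Memberships follow from the relative distances to all centers.
        for i, p in enumerate(points):
            dist = [max(sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                    for c in centers]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                                    for j in range(n_clusters))
    return centers, u

# Toy two-dimensional data standing in for PCA-reduced evaluation parameters.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, u = fuzzy_c_means(points)
# A hard label is the cluster with the highest degree of membership.
labels = [max(range(2), key=lambda k: row[k]) for row in u]
```

Keeping the full membership matrix `u`, rather than only the hard labels, is what allows samples to be graded flexibly by their degree of membership.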
Abstract: With the increasing variety of application software in meteorological satellite ground systems, how to provide reasonable hardware resources and improve software efficiency is receiving more and more attention. In this paper, a software classification method based on software operating characteristics is proposed. The method uses run-time resource consumption to describe the running characteristics of the software. First, principal component analysis (PCA) is used to reduce the dimension of the software running feature data and to interpret the software characteristic information. Then a modified K-means algorithm is used to classify the meteorological data processing software. Finally, the classification is combined with the results of the principal component analysis to explain the significance of the operating characteristics of each software class. The result serves as a basis for optimizing the allocation of hardware resources to software and improving the efficiency of software operation.
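The two-stage procedure in the abstract above (PCA, then clustering in the reduced space) can be sketched as follows. This is an illustration under stated assumptions: it uses standard Lloyd-style K-means with a farthest-point initialization rather than the paper's modified K-means, and the four resource-usage columns and their values are synthetic, invented for the example.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal components using the SVD."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)  # explained-variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]

def kmeans(Z, k, n_iter=100):
    """Standard K-means (assumption: not the paper's modified variant)."""
    # Farthest-point initialization keeps the example deterministic.
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic run-time features per program: CPU %, memory %, disk I/O, network I/O.
rng = np.random.default_rng(0)
cpu_bound = rng.normal([90, 20, 5, 10], 2.0, size=(10, 4))
io_bound = rng.normal([15, 30, 80, 70], 2.0, size=(10, 4))
X = np.vstack([cpu_bound, io_bound])

Z, ratio = pca(X, n_components=2)
labels, centers = kmeans(Z, k=2)
```

Inspecting `ratio` and the component loadings is one way to interpret which resource dimensions separate the resulting classes.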
Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02); Hi-Tech Research and Development Programs of China (No. 2007AA041600).
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA and five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE), and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. Moreover, clustering accuracy is used as an indirect performance index to evaluate the quality of the representative data obtained by each method. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA performs excellently on both the quality assessment criteria and clustering accuracy.
Abstract: The precision of the kernel independent component analysis (KICA) algorithm depends on the type and parameter values of the kernel function. It is therefore of great significance to study how to choose KICA's kernel parameters in order to improve its feature dimension reduction results. In this paper, a fitness function is first established using the idea of the Fisher discriminant function. The global optimal solution of the fitness function is then searched for by the particle swarm optimization (PSO) algorithm, and a multi-state information dimension reduction algorithm based on PSO-KICA is established. Finally, the validity of this algorithm in enhancing the precision of feature dimension reduction is demonstrated.
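The parameter search described in the abstract above can be illustrated with a minimal particle swarm sketch. It assumes a single scalar kernel parameter and a toy fitness function with a known maximum; the paper's Fisher-based fitness and the KICA step are not reproduced. The names `pso_maximize` and `toy_fitness`, the search bounds, and the inertia and acceleration coefficients are illustrative assumptions.

```python
import random

def pso_maximize(fitness, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over one bounded scalar parameter (hypothetical helper)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                           # best position seen by each particle
    pbest_fit = [fitness(x) for x in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]  # best position seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity blends inertia, the particle's own memory, and the swarm's best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lower), upper)  # clamp to bounds
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], f
            if f > gbest_fit:
                gbest, gbest_fit = pos[i], f
    return gbest, gbest_fit

def toy_fitness(sigma):
    # Stand-in for a class-separability score; it peaks at sigma = 3 by construction.
    return -(sigma - 3.0) ** 2

best_sigma, best_fit = pso_maximize(toy_fitness, lower=0.1, upper=10.0)
```

In a real setting, `toy_fitness` would be replaced by a function that runs the dimension reduction with the candidate kernel parameter and scores how well the classes separate.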
Abstract: An improved face recognition method is proposed based on principal component analysis (PCA) combined with a genetic algorithm (GA), named genetic-based principal component analysis (GPCA). Initially, the eigenspace is created from the eigenvalues and eigenvectors. From this space the eigenfaces are constructed, and the most relevant eigenfaces are selected using GPCA. With these eigenfaces, the input images are classified based on Euclidean distance. The proposed method was tested on the ORL (Olivetti Research Labs) face database. Experimental results on this database demonstrate the effectiveness of the proposed method, which produces fewer misclassifications than previous methods.
Funding: Sponsored by the National Science Foundation of China (Grant Nos. 61201370, 61100103) and the Independent Innovation Foundation of Shandong University (Grant No. 2012DX07).
Abstract: This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), the two algorithms offer four benefits: (1) they avoid the long computation time that larger matrices incur when their eigenvalues and eigenvectors are computed; (2) they can extract local information for recognition; (3) the subblock idea is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss; (4) the idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and yield compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
Funding: National Natural Science Foundation of China (50675167); Foundation for the Author of National Excellent Doctoral Dissertation of China (200535).
Abstract: The support vector classifier (SVC) is well suited to small-sample learning problems with high dimensions and has especially good generalization ability. However, the high-dimensional features of the original samples contain some redundancy, and extracting the main features first may improve the performance of SVC. Principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and to pre-select the main features efficiently, and an SVC is constructed in the selected feature space to improve its learning speed and identification rate. Furthermore, a heuristic genetic algorithm-based automatic model selection is proposed to determine the hyperparameters of SVC and to evaluate the performance of the learning machines. Experiments performed on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically but also improves the identification rate effectively.
Abstract: The accurate extraction and classification of leather defects is an important guarantee for automation and quality evaluation in the leather industry. To address the problem of classifying leather defect data, a hierarchical classification of defects is proposed. First, samples are collected according to the minimum rectangle method, and defects are extracted by image processing. According to the geometric features of their representation, defects are divided into dot, line, and surface types as a rough classification. The dominant characteristics are then obtained by analysing the geometry, gray-level, and texture data extracted from the defects. For each type of defect, different representative characteristics are chosen to reduce the dimension of the data; clustering on these characteristics converges effectively, so that the defect characteristics are extracted accurately and digitized, and a database is eventually established. The results show that this method can achieve more than 90% accuracy and greatly improves the accuracy of classification.
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2009AA04Z162); National Nature Science Foundation of China (Nos. 60825302, 60934007, 61074061); Program of Shanghai Subject Chief Scientist; "Shu Guang" project supported by Shanghai Municipal Education Commission and Shanghai Education Development Foundation; Key Project of Shanghai Science and Technology Commission, China (No. 10JC1403400).
Abstract: In this paper, a low-dimensional multiple-input and multiple-output (MIMO) model predictive control (MPC) configuration is presented for spatially distributed systems (SDSs) whose governing partial differential equations (PDEs) are unknown. First, dimension reduction with principal component analysis (PCA) is used to transform the high-dimensional spatio-temporal data into a low-dimensional time domain. The MPC strategy is then built on low-dimensional models with online correction, in which the state of the system at the previous time step is used to correct the output of the low-dimensional models. Sufficient conditions for closed-loop stability are presented and proven. Simulations demonstrate the accuracy and efficiency of the proposed methodologies.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 50936005, 51576182, and 11172296).
Abstract: An automated method to optimize the definition of the progress variables in flamelet-based dimension reduction is proposed, and the performance of these optimized progress variables in coupling the flamelets and the flow solver is presented. In the proposed method, the progress variables are defined according to the first two principal components (PCs) from the principal component analysis (PCA) or kernel-density-weighted PCA (KEDPCA) of a set of flamelets. These flamelets can then be mapped to the new progress variables instead of the mixture fraction and conventional progress variables, and a new chemistry look-up table is constructed. A priori validation of the optimized progress variables and the new chemistry table is carried out in a CH4/N2/air lift-off flame. The reconstruction of the lift-off flame shows that the optimized progress variables perform better than the conventional ones, especially in the high-temperature area. The coefficients of determination (R² statistics) show that KEDPCA performs slightly better than PCA except for some minor species. The main advantage of KEDPCA is that it is less sensitive to the database. The criteria for the optimization are also proposed and discussed, and the constraint that the progress variables should evolve monotonically from fresh gas to burnt gas is analyzed in detail.
Funding: The National Defence Foundation of China (No. NEWL51435Qt220401).
Abstract: Kernel factor analysis (KFA) with varimax was proposed using a Mercer kernel function, which maps the data from the original space to a high-dimensional feature space, and was compared with kernel principal component analysis (KPCA). The results show that the best error rate in handwritten digit recognition achieved by KFA with varimax (4.2%) was superior to that of KPCA (4.4%). KFA with varimax can therefore recognize handwritten digits more accurately.
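For readers unfamiliar with the varimax step named in the abstract above, the following minimal sketch shows the widely used SVD-based iteration for a varimax rotation of a loading matrix. It covers only the rotation, not the kernel mapping or the factor extraction of the paper, and the loading matrix is a made-up example. The function name `varimax` and its parameters are illustrative assumptions.

```python
import numpy as np

def varimax(loadings, gamma=1.0, n_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix.

    Returns (rotated_loadings, R), where R is the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(n_iter):
        d_old = d
        L = loadings @ R
        # Gradient-like target of the varimax criterion, then its polar factor.
        target = L ** 3 - (gamma / p) * L @ np.diag(np.sum(L ** 2, axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ target)
        R = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return loadings @ R, R

# Hypothetical loadings: a simple two-factor structure, deliberately rotated by 30 degrees.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.1],
                   [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
loadings = simple @ rot

rotated, R = varimax(loadings)
```

Because `R` is orthogonal, the rotation leaves each variable's communality (the row sum of squared loadings) unchanged and only redistributes the loadings across factors.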
Funding: National Key R&D Program of China (No. 2020YFB1707700).
Abstract: The problems in equipment fault detection include data dimension explosion, computational complexity, and low detection accuracy. To solve these problems, a device anomaly detection algorithm based on enhanced long short-term memory (LSTM) is proposed. The algorithm first reduces the dimensionality of the device sensor data by principal component analysis (PCA), extracting the strongly correlated variables among the multidimensional sensor data with the lowest possible information loss, and then uses the enhanced stacked LSTM to predict the extracted temporal data, thus improving the accuracy of anomaly detection. To improve the efficiency of anomaly detection, a genetic algorithm (GA) is used to adjust the magnitude of the enhancements made by the LSTM model. Validation on actual pump data shows that the algorithm significantly improves the recall rate and the detection speed of device anomaly detection, with a recall rate of 97.07%, which indicates that the algorithm is effective and efficient for device anomaly detection in an actual production environment.