Funding: Sponsored by the National Science Foundation of China (Grant Nos. 61201370, 61100103) and the Independent Innovation Foundation of Shandong University (Grant No. 2012DX07).
Abstract: This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), the two algorithms offer four benefits: (1) They avoid the long computation time that a large matrix incurs when its eigenvalues and eigenvectors are computed. (2) They extract local information for recognition. (3) The subblock idea is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss. (4) The same idea is introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA preserves the local neighbor graph structure and yields compact feature expressions. Experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracy than their non-subpattern versions and SpC2DPCA.
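The subblock idea in this abstract can be illustrated with a minimal sketch. It covers only the plain 2DPCA step applied per subpattern, not the discriminant or locality-preserving variants, and the image size, block size, and function names are assumptions made for illustration. Each block yields a 16 x 16 eigenproblem instead of one 48 x 48 problem for the whole image.

```python
import numpy as np

def split_into_subpatterns(images, block_rows, block_cols):
    """Cut each (H, W) image into non-overlapping blocks.

    Returns an array of shape (n_blocks, n_images, block_rows, block_cols).
    """
    n, h, w = images.shape
    assert h % block_rows == 0 and w % block_cols == 0
    blocks = images.reshape(n, h // block_rows, block_rows,
                            w // block_cols, block_cols)
    # Bring the two block-grid axes to the front, then merge them.
    blocks = blocks.transpose(1, 3, 0, 2, 4)
    return blocks.reshape(-1, n, block_rows, block_cols)

def two_d_pca_axes(samples, n_axes):
    """Plain 2DPCA: leading eigenvectors of the image covariance matrix."""
    centered = samples - samples.mean(axis=0)
    # G = (1/n) * sum_i A_i^T A_i, a (cols x cols) symmetric matrix.
    g = np.einsum("nij,nik->jk", centered, centered) / len(samples)
    eigvals, eigvecs = np.linalg.eigh(g)
    order = np.argsort(eigvals)[::-1][:n_axes]
    return eigvecs[:, order]

# Hypothetical data: 20 random "images" of size 64 x 48.
rng = np.random.default_rng(0)
images = rng.normal(size=(20, 64, 48))

subpatterns = split_into_subpatterns(images, 16, 16)
features = []
for block_set in subpatterns:
    axes = two_d_pca_axes(block_set, n_axes=4)
    features.append(block_set @ axes)      # (n_images, 16, 4) per block
features = np.stack(features, axis=1)      # (n_images, n_blocks, 16, 4)
```

Concatenating the per-block features gives one descriptor per image; how the blocks are weighted or classified afterwards is left open here.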
Abstract: This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually treated as a low-rank matrix completion problem solved by matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches rely on SVD operations. Given an m × n matrix, the time complexity of the SVD operation is O(mn^2), which is prohibitive for large-scale problems. This paper proposes an efficient and scalable algorithm for tensor PCA, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses the vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix factorization based method.
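To make the phrase "an outer product of vectors" concrete, here is a small sketch of a rank-one tensor fit. It uses a classical alternating power iteration as a stand-in and is not the LADMVTPCA algorithm, whose update rules the abstract does not give. All function names are hypothetical. Note that no SVD is needed: each step is a contraction of the tensor with vectors.

```python
from functools import reduce
import numpy as np

def contract_except(tensor, factors, skip=None):
    """Contract `tensor` with every factor vector except the one at mode `skip`."""
    result = tensor
    # Go from the last mode to the first so earlier axis positions stay valid.
    for mode in reversed(range(tensor.ndim)):
        if mode != skip:
            result = np.tensordot(result, factors[mode], axes=([mode], [0]))
    return result

def rank_one_power_iteration(tensor, n_iter=50, seed=0):
    """Fit weight * (u_1 outer u_2 outer ... outer u_d) by alternating updates."""
    rng = np.random.default_rng(seed)
    factors = [rng.normal(size=d) for d in tensor.shape]
    factors = [f / np.linalg.norm(f) for f in factors]
    for _ in range(n_iter):
        for mode in range(tensor.ndim):
            update = contract_except(tensor, factors, skip=mode)
            factors[mode] = update / np.linalg.norm(update)
    weight = float(contract_except(tensor, factors))
    return factors, weight

def outer_product(vectors):
    """Outer product of a list of vectors, giving a tensor of matching order."""
    return reduce(np.multiply.outer, vectors)

# Hypothetical order-3 example: a rank-one signal plus small noise.
rng = np.random.default_rng(1)
true_vectors = [rng.normal(size=d) for d in (6, 5, 4)]
true_vectors = [v / np.linalg.norm(v) for v in true_vectors]
tensor = 3.0 * outer_product(true_vectors) + 0.01 * rng.normal(size=(6, 5, 4))

factors, weight = rank_one_power_iteration(tensor)
approximation = weight * outer_product(factors)
```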
Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02) and Hi-Tech Research and Development Programs of China (No. 2007AA041600).
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA and five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE), and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. In addition, clustering accuracy is used as an indirect performance index to evaluate the quality of the reduced data that the methods produce. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA performs very well on both the quality assessment criteria and clustering accuracy.
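As an illustration of one of the rank-based criteria named above, the following sketch computes the trustworthiness part of T&C in its commonly cited form (a penalty for points that become neighbors in the embedding without being neighbors in the original space). The function name and the toy data are assumptions, and the normalization used here is valid for k < n/2.

```python
import numpy as np

def neighbor_ranks(data):
    """rank[i, j] = position of point j when sorted by distance from point i."""
    n = data.shape[0]
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=-1)
    np.fill_diagonal(dist, -1.0)            # each point gets rank 0 for itself
    order = np.argsort(dist, axis=1)
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return rank

def trustworthiness(high, low, k):
    """Trustworthiness of the embedding `low` of the data `high` (needs k < n/2)."""
    n = high.shape[0]
    rank_high = neighbor_ranks(high)
    rank_low = neighbor_ranks(low)
    # Points among the k nearest in the embedding but not in the original space.
    intruders = (rank_low >= 1) & (rank_low <= k) & (rank_high > k)
    penalty = np.sum((rank_high - k)[intruders])
    return 1.0 - 2.0 / (n * k * (2 * n - 3 * k - 1)) * penalty

# Hypothetical data: 60 points in 5-D, "embedded" by keeping two coordinates.
rng = np.random.default_rng(0)
high = rng.normal(size=(60, 5))
low = high[:, :2]
score = trustworthiness(high, low, k=5)
```

Continuity is obtained by swapping the roles of the two spaces, so the same helper can be reused for it.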
Abstract: The accurate extraction and classification of leather defects is an important prerequisite for automation and quality evaluation in the leather industry. To address the problem of classifying leather defect data, a hierarchical classification of defects is proposed. First, samples are collected using the minimum rectangle method, and defects are extracted by image processing. According to their geometric features, the defects are divided into dot, line, and surface types for rough classification. The dominant characteristics are then obtained by analyzing the geometry, gray-level, and texture data extracted from the defects. For each type of defect, different representative characteristics are chosen, the dimension of the data is reduced, and clustering on these characteristics converges effectively, so that the defect characteristics are accurately extracted and digitized and a database is eventually established. The results show that this method achieves more than 90% accuracy and greatly improves the accuracy of classification.
Funding: Supported by the Medical Scientific Research Foundation of Guangdong Province, China (B2009043).
Abstract: Liquid-state methanol and ethanol at different temperatures have been investigated by FT-NIR (Fourier transform near-infrared) spectroscopy, generalized two-dimensional (2D) correlation spectroscopy, and PCA (principal component analysis). First, the FT-NIR spectra were measured over a temperature range of 30-64 (or 30-71) °C, and then the 2D correlation spectra were computed. Combining near-infrared spectroscopy, generalized 2D correlation spectroscopy, and the literature, we analyzed the molecular structures (especially the hydrogen bonds) of methanol and ethanol and performed the NIR band assignments. The PCA method was employed to verify the results of the 2D analysis. This study will help in understanding these reagents.
Funding: Supported by the National Natural Science Foundation of China-State Grid Joint Fund for Smart Grid (No. U2066210).
Abstract: Because the wavelet packet transform (WPT) alone can only weakly detect the occurrence of a fault, this paper applies a fault diagnosis algorithm that combines the WPT with principal component analysis (PCA) to inverter-side fault diagnosis in a multi-terminal hybrid high-voltage direct current (HVDC) network, which can significantly improve the speed and accuracy of fault diagnosis. Firstly, current amplitude and current slope are used as the sampled data, and the WPT is used to extract the energy spectrum of the signal. Secondly, an energy matrix is constructed, and the PCA method is used to compute the squared prediction error (SPE) statistic of each signal, which reflects how far the measured value deviates from the principal component model at a given time; a fault is judged to have occurred when the SPE exceeds its limit. The maximum SPE values are then compared to determine the fault type. Finally, a large number of MATLAB/Simulink simulation results show that the PCA method using the current slope as the sampled data can detect a ground fault with small transition resistance within 2 ms and identify the fault type within 10 ms, without being affected by the sampling frequency.
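The SPE check described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the energy matrix is synthetic (random numbers standing in for wavelet-packet band energies), the number of retained components is chosen by hand, and the control limit is an empirical 99% quantile of the normal-condition SPE rather than whatever limit the authors use.

```python
import numpy as np

def fit_pca(normal_data, n_components):
    """Fit a principal component model on normal-condition samples (rows)."""
    mean = normal_data.mean(axis=0)
    _, _, vt = np.linalg.svd(normal_data - mean, full_matrices=False)
    return mean, vt[:n_components].T        # loadings: (n_features, n_components)

def spe(samples, mean, loadings):
    """Squared prediction error: squared norm of the part PCA cannot explain."""
    centered = samples - mean
    residual = centered - centered @ loadings @ loadings.T
    return np.sum(residual ** 2, axis=1)

# Hypothetical energy matrix: 200 normal samples x 8 band energies,
# driven by 2 latent factors plus small measurement noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
normal = latent @ mixing + 0.05 * rng.normal(size=(200, 8))

mean, loadings = fit_pca(normal, n_components=2)
limit = np.quantile(spe(normal, mean, loadings), 0.99)   # assumed empirical limit

# A new sample is flagged as faulty when its SPE exceeds the limit.
new_sample = normal[:1] + np.array([[0, 0, 0, 5.0, 0, 0, 0, 0]])
is_fault = spe(new_sample, mean, loadings) > limit
```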
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2009AA04Z162), the National Nature Science Foundation of China (Nos. 60825302, 60934007, 61074061), the Program of Shanghai Subject Chief Scientist, the "Shu Guang" project supported by the Shanghai Municipal Education Commission and the Shanghai Education Development Foundation, and the Key Project of Shanghai Science and Technology Commission, China (No. 10JC1403400).
Abstract: In this paper, a low-dimensional multiple-input and multiple-output (MIMO) model predictive control (MPC) configuration is presented for spatially distributed systems (SDSs) whose governing partial differential equations (PDEs) are unknown. First, dimension reduction with principal component analysis (PCA) is used to transform the high-dimensional spatio-temporal data into a low-dimensional time domain. The MPC strategy is then built on low-dimensional models with online correction, where the state of the system at the previous time step is used to correct the output of the low-dimensional models. Sufficient conditions for closed-loop stability are presented and proven. Simulations demonstrate the accuracy and efficiency of the proposed methodologies.
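The first step, reducing spatio-temporal data to a few time-domain signals, might look like the sketch below. Only the PCA reduction is shown; the low-dimensional models, the online correction, and the MPC law are not. The snapshot data, the grid, and the 99.9% energy threshold are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical snapshot matrix: rows are spatial points, columns are time steps.
x = np.linspace(0.0, np.pi, 50)
t = np.linspace(0.0, 5.0, 200)
snapshots = (np.outer(np.sin(x), np.exp(-0.3 * t))
             + 0.5 * np.outer(np.sin(2 * x), np.cos(2.0 * t)))

# PCA over time: remove the mean spatial profile, then take the SVD.
mean_profile = snapshots.mean(axis=1, keepdims=True)
u, s, _ = np.linalg.svd(snapshots - mean_profile, full_matrices=False)

# Keep the fewest spatial basis functions capturing 99.9% of the variance.
energy = np.cumsum(s ** 2) / np.sum(s ** 2)
n_modes = int(np.searchsorted(energy, 0.999) + 1)
basis = u[:, :n_modes]                             # (n_space, n_modes)

# Low-dimensional time-domain signals, one row per retained mode.
coeffs = basis.T @ (snapshots - mean_profile)      # (n_modes, n_time)
reconstruction = mean_profile + basis @ coeffs
```

A controller would then operate on the rows of `coeffs` instead of the full spatial field.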
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 41672329, 41272365), the National Key Research and Development Project of China (Grant No. 2016YFC0600509), and the Project of China Geological Survey (Grant No. 1212011120341).
Abstract: The concentration of elements or element groups in a geological body is the result of multiple stages of rock-forming and ore-forming geological processes. An ore-forming element group can be identified by PCA (principal component analysis) and separated into two components using BEMD (bi-dimensional empirical mode decomposition): (1) a high background component, which represents the ore-forming background developed in rocks through various geological processes favorable for mineralization (i.e., magmatism, sedimentation and/or metamorphism); and (2) an anomaly component, which reflects the ore-forming anomaly that was overprinted on the high background component during mineralization. Anomaly components identify ore-finding targets more effectively than ore-forming element groups. Three data analysis steps are described in this paper: first, applying PCA to establish the ore-forming element group; second, applying BEMD to the ore-forming element group to identify the anomaly components created by different types of mineralization processes; and finally, identifying ore-finding targets based on the anomaly components. The method is applied to the Tengchong tin-polymetallic belt to delineate ore-finding targets, where four targets for Sn(W) and three targets for Pb-Zn-Ag-Fe polymetallic mineralization are identified and defined as new areas for further prospecting. The results show that BEMD combined with PCA can be used not only to extract the anomaly component for delineating ore-finding targets, but also to extract the residual component of an ore-forming element group for identifying its high background zone favorable for mineralization.
Funding: Supported by the National Natural Science Foundation of China (60132030).
Abstract: The main research motive is to analyze and verify the inherent nonlinear character of MPEG-4 video. The power spectral density estimate of the video traffic reveals its 1/f^β and periodic characteristics. Principal component analysis of the reconstructed phase space shows that only a few principal components are needed to represent all dimensions. The correlation dimension analysis demonstrates its fractal characteristic. To compute the largest Lyapunov exponent accurately, the video traffic is divided into many parts, and the largest Lyapunov exponent spectrum is calculated separately for each part using the small data sets method. This spectrum shows that there is abundant nonlinear chaos in MPEG-4 video traffic. It can be concluded that MPEG-4 video traffic has complex nonlinear behavior and can be characterized by its power spectral density, principal components, correlation dimension, and largest Lyapunov exponent in addition to its common statistics.
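The principal component analysis of a reconstructed phase space can be sketched as below. The series is a synthetic periodic signal with noise standing in for video traffic, and the embedding dimension and delay are arbitrary choices for illustration, not values from the paper.

```python
import numpy as np

def delay_embed(series, dim, delay):
    """Time-delay reconstruction: row n is (x[n], x[n+delay], ..., x[n+(dim-1)*delay])."""
    n_vectors = len(series) - (dim - 1) * delay
    columns = [series[i * delay : i * delay + n_vectors] for i in range(dim)]
    return np.stack(columns, axis=1)

def explained_variance_ratio(points):
    """Share of total variance carried by each principal component."""
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values ** 2 / np.sum(singular_values ** 2)

# Hypothetical stand-in series: a periodic signal plus measurement noise.
rng = np.random.default_rng(0)
n = np.arange(2000)
series = np.sin(0.2 * n) + 0.1 * rng.normal(size=n.size)

embedded = delay_embed(series, dim=10, delay=1)    # (1991, 10)
ratios = explained_variance_ratio(embedded)
cumulative = np.cumsum(ratios)                     # how fast the variance saturates
```

If `cumulative` approaches 1 after only a few entries, a few principal components represent the whole reconstructed space, which is the kind of evidence the abstract refers to.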
Abstract: The evolution of monthly runoff is affected by both the climate and human activities, and its characteristics play an important role in runoff prediction and simulation. In this paper, the G-P method and the principal component analysis method, both of which are based on phase space reconstruction theory, are used to study the chaotic characteristics of the monthly runoff series at Fudedian station in the Liaohe basin. The results show that the monthly runoff series is very likely chaotic.
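Abstract: Headspace solid-phase microextraction combined with comprehensive two-dimensional gas chromatography/mass spectrometry (HS-SPME-GC×GC-MS) was used to analyze the types and contents of volatile compounds in four health-care Huangjiu (Chinese rice wines): Huangjing (Polygonatum) wine, yellow millet wine, quinoa wine, and tartary buckwheat wine. Principal component analysis clearly distinguished the health-care Huangjiu made from different raw materials and identified the key compositional differences, which were used to explore their flavor components. The results showed that GC×GC-MS detected 156 volatile components in the four wines. After selecting the components with a match score above 800, a total of 140 volatile components were identified, including esters, alcohols, aldehydes and ketones, acids, hydrocarbons, nitrogen-containing compounds, benzene-series aromatic hydrocarbons, and other compounds. By identifying the volatile components of Huangjiu, the method can help relate those components to Huangjiu quality and provides a theoretical basis for optimizing the production of health-care Huangjiu.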
Funding: Supported by the Beijing Municipal Social Science Foundation (SZ202210005004) and the Beijing Natural Science Foundation (9242004).
Abstract: Motivated by the Bagging Partial Least Squares (Bagging PLS) and Principal Component Analysis (PCA) algorithms, this paper introduces a novel approach called Principal Model Analysis (PMA), which combines PCA and Bagging PLS. In this method, multiple PLS models are trained on sub-training sets drawn from the training set by random sampling with replacement. The regression coefficients of all the sub-PLS models are fused into a joint regression coefficient matrix. The final projection direction is then estimated by performing PCA on the joint regression coefficient matrix. The proposed PMA method is compared with other traditional dimension reduction methods, such as PLS, Bagging PLS, linear discriminant analysis (LDA), and PLS-LDA. Experimental results on six public datasets demonstrate that the proposed method consistently outperforms the other approaches in classification performance and exhibits greater stability. The method is also applied to financial statement fraud identification, a problem of concern to the academic community: PMA and five other algorithms are used to detect financial statement fraud, and the results indicate that the classification performance of PMA surpasses that of the other methods.
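The pipeline in this abstract (bootstrap sub-training sets, one PLS model per set, PCA on the stacked coefficients) can be sketched in a few lines. Several simplifications are assumed here and are not taken from the paper: each sub-model is a one-component PLS1 regression on a 0/1 class label, the PCA step is an uncentered SVD of the joint coefficient matrix, only the first direction is kept, and all names and data are hypothetical.

```python
import numpy as np

def pls1_coefficients(x, y):
    """Regression coefficients of a one-component PLS1 model (assumed sub-model)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    w = xc.T @ yc                      # weight vector: covariance of features with y
    w /= np.linalg.norm(w)
    t = xc @ w                         # latent scores
    return w * (t @ yc) / (t @ t)

def principal_model_direction(x, y, n_models=30, seed=0):
    """Bootstrap PLS models, stack their coefficients, take the leading direction."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)           # random sampling with replacement
        coefs.append(pls1_coefficients(x[idx], y[idx]))
    joint = np.stack(coefs, axis=1)                # (n_features, n_models)
    u, _, _ = np.linalg.svd(joint, full_matrices=False)
    direction = u[:, 0]
    # Resolve the sign ambiguity of the SVD using the average coefficient vector.
    if direction @ joint.mean(axis=1) < 0:
        direction = -direction
    return direction, joint

# Hypothetical two-class data: the classes differ only along feature 0.
rng = np.random.default_rng(42)
y = np.repeat([0.0, 1.0], 100)
x = rng.normal(size=(200, 6))
x[:, 0] += 3.0 * y

direction, joint = principal_model_direction(x, y)
scores = x @ direction                 # one-dimensional projection for classification
```

A classifier such as a simple threshold on `scores` could follow; which classifier the authors use is not stated in the abstract.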