Abstract: We propose a novel framework for learning a low-dimensional representation of data based on nonlinear dynamical systems, which we call dynamical dimension reduction (DDR). In the DDR model, each point is evolved via a nonlinear flow towards a lower-dimensional subspace; the projection onto the subspace gives the low-dimensional embedding. Training the model involves identifying the nonlinear flow and the subspace. Following the equation discovery method, we represent the vector field that defines the flow as a linear combination of dictionary elements, where each element is a pre-specified linear or nonlinear candidate function. A regularization term for the average total kinetic energy is also introduced, motivated by optimal transport theory. We prove that the resulting optimization problem is well-posed and establish several properties of the DDR method. We also show how the DDR method can be trained using a gradient-based optimization method, where the gradients are computed using the adjoint method from optimal control theory. The DDR method is implemented and compared on synthetic and example data sets with other dimension reduction methods, including PCA, t-SNE, and UMAP.
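The forward pass described in this abstract (flow each point along a dictionary-based vector field, then project onto a subspace) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the dictionary (linear and cubic terms), the forward Euler integrator, the axis-aligned target subspace, the hand-picked coefficients, and the form of the kinetic-energy average. No training or adjoint gradient computation is shown.

```python
# Forward pass of a DDR-style model (illustrative sketch, not the paper's code).
# Assumptions: dictionary = {x_i, x_i^3}, forward Euler time stepping, and an
# axis-aligned target subspace spanned by the first d coordinates.

def dictionary(x):
    """Candidate functions: the linear terms followed by the cubic terms."""
    return list(x) + [xi ** 3 for xi in x]


def vector_field(x, coeffs):
    """f(x) = coeffs @ dictionary(x); coeffs has one row per state coordinate."""
    phi = dictionary(x)
    return [sum(c * p for c, p in zip(row, phi)) for row in coeffs]


def flow_and_project(x0, coeffs, d, t_final=5.0, dt=0.01):
    """Integrate dx/dt = f(x) from x0, then keep the first d coordinates.

    Returns (embedding, final_state, avg_energy), where avg_energy is a
    Riemann-sum approximation of (1/T) * integral of 0.5 * |f(x(t))|^2 dt.
    """
    x = list(x0)
    steps = int(round(t_final / dt))
    energy = 0.0
    for _ in range(steps):
        v = vector_field(x, coeffs)
        energy += 0.5 * sum(vi * vi for vi in v) * dt
        x = [xi + dt * vi for xi, vi in zip(x, v)]  # forward Euler step
    return x[:d], x, energy / t_final


# Hypothetical coefficients: dx1/dt = 0 and dx2/dt = -x2, so every point
# drifts towards the line x2 = 0 while its first coordinate stays fixed.
coeffs = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
]
embedding, final_state, avg_energy = flow_and_project([1.5, 2.0], coeffs, d=1)
```

In a trained model the coefficient matrix and the subspace would be learned; here they are fixed by hand only to show how the flow, the projection, and the energy term fit together.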
Abstract: Dimension reduction is defined as the process of projecting high-dimensional data onto a much lower-dimensional space. Dimension reduction methods are variously applied in regression, classification, feature analysis, and visualization. In this paper, we review in detail the most recent versions of the methods that have been extensively developed in the past decade.
Abstract: Principal component analysis and generalized low rank approximation of matrices are two different dimensionality reduction methods. Both algorithms are applied to the L1-CSVM model based on the augmented Lagrange method to explore how the running time and accuracy of the model vary in the reduced-dimension space. The results show that the improved algorithm can greatly reduce the running time and improve the accuracy.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51490662) and the National Science Fund for Distinguished Young Scholars (Grant No. 51725502).
Abstract: In this paper, a class of electromagnetic field frequency domain reliability problems is first defined. Frequency domain reliability refers to the probability that an electromagnetic performance indicator meets the intended requirements within a specific frequency band, considering the uncertainty of structural parameters and frequency-variant electromagnetic parameters. A frequency domain reliability analysis method based on the univariate dimension reduction method is then proposed, which provides an effective calculation tool for electromagnetic frequency domain reliability. In electromagnetic problems, performance indicators usually vary with frequency. The method first discretizes the frequency-variant performance indicator function into a series of functions at discrete frequency points, and then transforms the frequency domain reliability problem into a series system reliability problem over these functions. Second, the univariate dimension reduction method is introduced to solve for the probability distribution functions and correlation coefficients of the functions at the discrete frequency points. Finally, based on these results, the series system reliability is solved to obtain the frequency domain reliability, and the cumulative distribution function of the performance indicator can also be obtained. Monte Carlo simulation is adopted to demonstrate the validity of the frequency domain reliability analysis method. Three examples are investigated to demonstrate the accuracy and efficiency of the proposed method.
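As a rough illustration of the univariate dimension reduction idea mentioned in this abstract, the sketch below estimates the mean of a function of independent normal inputs using the standard additive decomposition g(x) ≈ Σ_i g(μ_1, ..., x_i, ..., μ_n) − (n − 1) g(μ). The test function, the input distributions, and the three-point Gauss-Hermite rule are assumptions chosen for illustration; the paper applies the method to distribution functions and correlation coefficients of electromagnetic performance indicators, which this sketch does not attempt.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal variable:
# nodes 0 and +/- sqrt(3) with weights 2/3 and 1/6 (exact up to degree 5).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def udr_mean(g, means, stds):
    """Approximate E[g(X)] for independent normal inputs X_i ~ N(mean_i, std_i^2).

    Uses g(x) ~ sum_i g(mu_1, ..., x_i, ..., mu_n) - (n - 1) * g(mu), so only
    one-dimensional integrals are needed, each done by quadrature.
    """
    n = len(means)
    total = -(n - 1) * g(list(means))
    for i in range(n):
        for z, w in zip(NODES, WEIGHTS):
            point = list(means)          # all inputs held at their means...
            point[i] = means[i] + stds[i] * z  # ...except the i-th one
            total += w * g(point)
    return total


def g(x):
    """Hypothetical performance function, additive in its two inputs."""
    return x[0] ** 2 + 2.0 * x[1]


estimate = udr_mean(g, means=[1.0, 2.0], stds=[0.5, 1.0])
```

Because this example function is additive, the decomposition is exact and the estimate matches the true mean, (1 + 0.25) + 2 · 2 = 5.25, up to rounding. For functions with interaction terms the result is only an approximation.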
Funding: Sponsored by the National Basic Research Program of China (Grant No. 2015CB057400).
Abstract: The transient proper orthogonal decomposition (TPOD) method is used to study the dynamic behaviors of reduced rotor-bearing models, and the fault-free model is compared with models with a looseness fault. A rotor model with 22 degrees of freedom (DOFs) supported by bearings is established. Pedestal looseness at one end and at both ends of the liquid-film bearings is studied by analyzing the time history and frequency-spectrum curves. The effects of the initial displacement and velocity values on the frequency components of the original systems and on the dimension reduction efficiency are discussed. Moreover, the effects of varying the initial conditions on the efficiency of the TPOD method are studied. The reduced models can provide guidance, from both theoretical and numerical simplification perspectives, for discussing the characteristics of the pedestal looseness fault.
Funding: Supported by the National Natural Science Foundation of China (60772066).
Abstract: The feature-selection problem in training AdaBoost classifiers is addressed in this paper. A working feature subset is generated by a novel feature subset selection method based on partial least squares (PLS) regression, and features are then trained and selected from this subset in Boosting. The experiments show that the proposed PLS-based feature-selection method outperforms the current feature ranking method and the random sampling method.
Funding: Supported by the Research Grants Council of Hong Kong under Grant No. 17301214, HKU CERG Grants, the Fundamental Research Funds for the Central Universities, the Research Funds of Renmin University of China, the Hung Hing Ying Physical Research Grant, and the Natural Science Foundation of China under Grant No. 11271144.
Abstract: Driven by the challenge of integrating large amounts of experimental data, classification has emerged as one of the major and popular tools in computational biology and bioinformatics research. Machine learning methods, especially kernel methods with Support Vector Machines (SVMs), are very popular and effective tools. From the perspective of the kernel matrix, a technique called Eigen-matrix translation has been introduced for protein data classification. The Eigen-matrix translation strategy has many useful properties that deserve further exploration. This paper investigates the major role of Eigen-matrix translation in classification. The authors propose that its importance lies in the dimension reduction of predictor attributes within the data set, which is very important when the dimension of the features is huge. The authors show by numerical experiments on real biological data sets that the proposed framework is crucial and effective in improving classification accuracy. It can therefore serve as a novel perspective for future research on dimension reduction problems.
Abstract: Data with large dimensions bring various problems to the application of data envelopment analysis (DEA). In this study, we focus on a "big data" problem related to the considerably large dimensions of the input-output data. The four most widely used approaches to guide dimension reduction in DEA are compared via Monte Carlo simulation: principal component analysis (PCA-DEA), which is based on the idea of aggregating inputs and outputs, and efficiency contribution measurement (ECM), average efficiency measure (AEC), and regression-based detection (RB), which are based on the idea of variable selection. We compare the performance of these methods under different scenarios and with a brand-new comparison benchmark for the simulation test. In addition, we discuss the effect of initial variable selection in RB for the first time. Based on the results, we offer more reliable guidelines on how to choose an appropriate method.
Abstract: This paper explores the realization of robotic motion planning, especially the Findpath problem, which is a basic motion planning problem that arises in the development of robotics. Findpath means: given the initial and desired final configurations of a robotic arm in 3-dimensional space, and given descriptions of the obstacles in the space, determine whether there is a continuous collision-free motion of the robotic arm from one configuration to the other, and find such a motion if it exists. There are several branches of approach in the motion planning area, but in practice the important things are the feasibility, efficiency, and accuracy of the method. In this paper, according to the concepts of Configuration Space (C-Space) and Rotation Mapping Graph (RMG) discussed in [1], a topological method named the Dimension Reduction Method (DRM) is presented for investigating the connectivity of the RMG (or the topological structure of the RMG). Based on this approach, the Findpath problem is transformed into that of finding a connected path in a finite Characteristic Network (CN). The method has shown great potential in practice. A simulation system is designed to embody DRM, and it is foreseeable that DRM can be adopted in the first overall planning of real robot systems in the near future.
Abstract: In recent years, networks have continued to enter people's lives, followed by a continuing stream of network security issues that cause substantial economic losses worldwide. As an effective method to tackle network security issues, intrusion detection systems have been widely used and studied. In this paper, dimension reduction is applied to the features of the NSL-KDD data set to remove features with low correlation and high interference and to improve computational efficiency. To improve the detection rate and accuracy of intrusion detection, this paper introduces the particle method for the first time, which we call intrusion detection with particle (IDP). To illustrate the effectiveness of this method, experiments are carried out on three kinds of data: before dimension reduction, after dimension reduction, and with the particle method applied on top of dimension reduction. A comparison of the results of DT, NN, SVM, K-NN, and NB shows that the particle method can effectively improve the intrusion detection rate.
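The abstract does not state how low-correlation features are identified. One common way to do it, shown here purely as an assumed illustration, is to drop every feature whose absolute Pearson correlation with the class label falls below a threshold. The toy data, the threshold value, and the function names are hypothetical and unrelated to NSL-KDD itself.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(rows, labels, threshold=0.3):
    """Return indices of features with |correlation to label| >= threshold."""
    kept = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) >= threshold:
            kept.append(j)
    return kept


# Toy data: feature 0 tracks the label, feature 1 is constant,
# and feature 2 is uncorrelated with the label.
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 5.0, 0.0],
    [0.9, 5.0, 1.0],
    [0.8, 5.0, 0.0],
]
labels = [0, 0, 1, 1]

kept = select_features(rows, labels)
reduced = [[row[j] for j in kept] for row in rows]
```

A filter of this kind only captures linear, one-feature-at-a-time dependence; the reduced feature matrix would then be passed to whichever classifier is being compared.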
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10772056, 10632040), the Natural Science Foundation of Heilongjiang Province, China (Grant No. ZJG0704), and the Harbin Science & Technology Innovative Foundation, China (Grant No. 2007RFLXG009).
Abstract: In this paper, the concept of the approximate inertial manifold (AIM) is extended to develop a nonlinear order reduction technique for non-autonomous nonlinear systems in second-order form. Using the modal transformation, a large nonlinear dynamical system is split into a 'master' subsystem, a 'slave' subsystem, and a 'negligible' subsystem. Accordingly, a novel order reduction method (Method I) is developed to construct a low-order subsystem by neglecting the 'negligible' subsystem and slaving the 'slave' subsystem to the 'master' subsystem using the extended AIM. For comparison, Method II, which accounts for the effects of both the 'slave' subsystem and the 'negligible' subsystem, is also applied to obtain the reduced-order subsystem. A typical 5-degree-of-freedom nonlinear dynamical system is then used to compare the accuracy and efficiency of the traditional Galerkin truncation (which ignores the contributions of the slave and negligible subsystems), Method I, and Method II. It is shown that Method I gives a considerable increase in accuracy at little additional computational cost compared with the standard Galerkin method, and produces almost the same accuracy as Method II. Finally, a 3-degree-of-freedom nonlinear dynamical system is analyzed analytically to show the advantages and convenience of Method I for obtaining an analytically reduced-order system.