We propose a novel framework for learning a low-dimensional representation of data based on nonlinear dynamical systems, which we call dynamical dimension reduction (DDR). In the DDR model, each point is evolved via a nonlinear flow towards a lower-dimensional subspace; the projection onto the subspace gives the low-dimensional embedding. Training the model involves identifying the nonlinear flow and the subspace. Following the equation discovery method, we represent the vector field that defines the flow using a linear combination of dictionary elements, where each element is a pre-specified linear/nonlinear candidate function. A regularization term for the average total kinetic energy is also introduced and motivated by optimal transport theory. We prove that the resulting optimization problem is well-posed and establish several properties of the DDR method. We also show how the DDR method can be trained using a gradient-based optimization method, where the gradients are computed using the adjoint method from optimal control theory. The DDR method is implemented and compared on synthetic and example data sets to other dimension reduction methods, including PCA, t-SNE, and UMAP.
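As a hedged illustration of the model just described, the sketch below evolves points with a dictionary-parameterized vector field, projects onto a learned subspace, and penalizes the average kinetic energy. The dictionary choice, step count, and loss weights are assumptions, and autograd stands in for the paper's adjoint-method gradients.

```python
# A minimal sketch of the DDR forward model, not the authors' implementation.
import torch

def dictionary(x):
    # Candidate functions: linear terms plus simple nonlinearities (assumed).
    return torch.cat([x, torch.sin(x), x**3], dim=1)

class DDR(torch.nn.Module):
    def __init__(self, dim, k, steps=10, dt=0.1):
        super().__init__()
        self.W = torch.nn.Parameter(torch.zeros(3 * dim, dim))  # dictionary coefficients
        self.P = torch.nn.Parameter(torch.randn(dim, k) * 0.1)  # subspace basis
        self.steps, self.dt = steps, dt

    def forward(self, x):
        kinetic = 0.0
        for _ in range(self.steps):            # explicit Euler integration of the flow
            v = dictionary(x) @ self.W
            kinetic = kinetic + (v**2).sum(dim=1).mean() * self.dt
            x = x + self.dt * v
        z = x @ self.P                          # low-dimensional embedding
        return z, x, kinetic

model = DDR(dim=5, k=2)
x = torch.randn(128, 5)
z, x_T, kinetic = model(x)
# Penalize distance of the evolved points from the subspace, plus kinetic energy.
recon = x_T @ model.P @ torch.linalg.pinv(model.P)   # projection back into R^dim
loss = ((x_T - recon)**2).mean() + 1e-2 * kinetic
loss.backward()   # autograd here stands in for the adjoint-method gradients
```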
The equipment used in various fields contains an increasing number of parts with curved surfaces of increasing size. Five-axis computer numerical control (CNC) milling is the main method for machining such parts, and their dynamics analysis has always been a research hotspot. The cutting conditions determined by the cutter axis, tool path, and workpiece geometry are complex and changeable, which has made dynamics research a major challenge. For this reason, this paper introduces the innovative idea of applying dimension reduction and mapping to the five-axis machining of curved surfaces, and proposes an efficient dynamics analysis model. To simplify the research object, the cutter position points along the tool path were discretized into inclined-plane five-axis machining. The cutter dip angle and feed deflection angle were used to define the spatial position relationship in five-axis machining. These were then taken as the new base variables to construct an abstract two-dimensional space and establish the mapping relationship between the cutter position point set and the space point set, to further simplify the dimensions of the research object. Based on the in-cut cutting edge solved by the space limitation method, the dynamics of the inclined-plane five-axis machining unit were studied, and the results were uniformly stored in the abstract space to produce a database. Finally, the prediction of the milling force and vibration state along the tool path became a data extraction process that significantly improved efficiency. Two experiments were also conducted, which proved the accuracy and efficiency of the proposed dynamics analysis model. This study has great potential for the online synchronization of intelligent machining of large surfaces.
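The following sketch illustrates the dimension-reduction-and-mapping idea in miniature: a dynamics quantity is precomputed over a grid in the abstract (cutter dip angle, feed deflection angle) space, and prediction along a tool path becomes a lookup. The dynamics function here is a placeholder, not the paper's model.

```python
# A hedged sketch of the abstract-space database: precompute a quantity over a
# (dip angle, feed deflection angle) grid, then turn prediction along a tool
# path into pure data extraction.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def milling_force(dip, defl):
    # Placeholder for the inclined-plane five-axis machining dynamics solution.
    return 100.0 + 30.0 * np.cos(dip) * np.sin(defl)

dips = np.linspace(0.0, np.pi / 3, 31)           # cutter dip angle samples
defls = np.linspace(-np.pi / 2, np.pi / 2, 61)   # feed deflection angle samples
D, F = np.meshgrid(dips, defls, indexing="ij")
database = milling_force(D, F)                   # solved once, stored uniformly

lookup = RegularGridInterpolator((dips, defls), database)
# Prediction along a tool path: map each cutter position point to its two
# angles, then read the stored result instead of re-running the dynamics.
path_angles = np.array([[0.2, 0.1], [0.3, -0.4], [0.5, 0.9]])
print(lookup(path_angles))
```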
Purpose: This study sought to review the characteristics, strengths, weaknesses, variants, application areas, and data types applied to the various dimension reduction (DR) techniques. Methodology: The databases most commonly employed to search for papers were ScienceDirect, Scopus, Google Scholar, IEEE Xplore, and Mendeley. An integrative review was used for the study, in which 341 papers were reviewed. Results: The linear techniques considered were Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), Latent Semantic Analysis (LSA), Locality Preserving Projections (LPP), Independent Component Analysis (ICA), and Projection Pursuit (PP). The non-linear techniques, developed for applications with complex non-linear structures, were Kernel Principal Component Analysis (KPCA), Multidimensional Scaling (MDS), Isomap, Locally Linear Embedding (LLE), Self-Organizing Map (SOM), Learning Vector Quantization (LVQ), t-distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). DR techniques can further be categorized into supervised, unsupervised, and more recently semi-supervised learning methods. The supervised versions are LDA and LVQ; all the other techniques are unsupervised. Supervised variants of PCA, LPP, KPCA, and MDS have been developed. Supervised and semi-supervised variants of PP and t-SNE have also been developed, as has a semi-supervised version of LDA. Conclusion: The various application areas, strengths, weaknesses, and variants of the DR techniques were explored, along with the different data types to which they have been applied.
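For readers who want to try several of the surveyed techniques, the snippet below runs a few of them with scikit-learn (UMAP lives in the separate umap-learn package and is omitted). Parameter values are arbitrary defaults, not recommendations from the review.

```python
# Run several of the surveyed DR techniques on a small standard dataset.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, KernelPCA, FastICA
from sklearn.manifold import MDS, Isomap, LocallyLinearEmbedding, TSNE

X, _ = load_digits(return_X_y=True)
X = X[:500]  # keep the example fast

embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "KPCA": KernelPCA(n_components=2, kernel="rbf").fit_transform(X),
    "ICA": FastICA(n_components=2, max_iter=500).fit_transform(X),
    "MDS": MDS(n_components=2).fit_transform(X),
    "Isomap": Isomap(n_components=2).fit_transform(X),
    "LLE": LocallyLinearEmbedding(n_components=2).fit_transform(X),
    "t-SNE": TSNE(n_components=2, perplexity=30).fit_transform(X),
}
for name, Z in embeddings.items():
    print(f"{name}: {Z.shape}")
```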
An automated method to optimize the definition of the progress variables in flamelet-based dimension reduction is proposed, and the performance of these optimized progress variables in coupling the flamelets and the flow solver is presented. In the proposed method, the progress variables are defined according to the first two principal components (PCs) from principal component analysis (PCA) or kernel-density-weighted PCA (KEDPCA) of a set of flamelets. These flamelets can then be mapped to the new progress variables instead of the mixture fraction/conventional progress variables, and a new chemistry look-up table is constructed. A priori validation of the optimized progress variables and the new chemistry table is carried out on a CH4/N2/air lift-off flame. The reconstruction of the lift-off flame shows that the optimized progress variables perform better than the conventional ones, especially in the high-temperature area. The coefficients of determination (R² statistics) show that the KEDPCA performs slightly better than the PCA except for some minor species. The main advantage of the KEDPCA is that it is less sensitive to the database. Meanwhile, criteria for the optimization are proposed and discussed, and the constraint that the progress variables should evolve monotonically from fresh gas to burnt gas is analyzed in detail.
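A minimal sketch of the core construction, under the assumption that a flamelet table is available as a matrix of thermochemical states: the first two principal components become the optimized progress variables. Random data stand in for real flamelet solutions, and the kernel-density weighting of KEDPCA is omitted.

```python
# Define optimized progress variables as the first two PCs of a flamelet set.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
flamelets = rng.random((2000, 12))       # rows: states; columns: species (+ T)

pca = PCA(n_components=2)
pcs = pca.fit_transform(flamelets)       # the two optimized progress variables

# Each flamelet state is re-indexed by (PC1, PC2) instead of the mixture
# fraction/conventional progress variable, giving the new look-up table keys.
table_keys = pcs
print(pca.explained_variance_ratio_)
```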
In the underwater waveguide, the conventional adaptive subspace detector (ASD), derived using the generalized likelihood ratio test (GLRT) theory, suffers from a significant degradation in detection performance when training data samples are deficient. This paper proposes a dimension-reduced approach to alleviate this problem. The dimension reduction includes two steps: first, the full array is divided into several subarrays; second, the test data and the training data at each subarray are transformed from the hydrophone domain into the modal domain. The modal-domain test data and training data at each subarray are then processed to formulate the subarray statistic using the GLRT theory. The final test statistic of the dimension-reduced ASD (DR-ASD) is obtained by summing all the subarray statistics. After the dimension reduction, the unknown parameters can be estimated more accurately, so the DR-ASD achieves better detection performance than the ASD. In order to achieve the optimal detection performance, the processing gain of the DR-ASD is derived so that a proper number of subarrays can be chosen. Simulation experiments verify the improved detection performance of the DR-ASD compared with the ASD.
Sustainable Development Capacity (SDC) is a comprehensive concept. In order to evaluate it relatively objectively, assessment index systems often use many indices covering various aspects. However, overlapping information among indices is a frequent source of deviation of the result from the truth. In this paper, 48 indices are selected as original variables for assessing the SDC of China's coastal areas. A mathematical dimension reduction treatment is used to eliminate the overlapping information in the 48 variables, and five new comprehensive indices are extracted that carry the essential information of the original indices. Based on the values of the new indices, a ranking of the SDC of the 12 coastal areas is obtained, and the regions are sorted into five patterns of sustainable development. The leading factors of SDC in these patterns, and their relations, are then analyzed. The research findings are discussed at the end.
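The following sketch shows one standard way such a dimension-reducing treatment can be realized, with PCA as an assumed stand-in for the paper's method: five comprehensive indices are extracted from the 48 standardized indicators and the areas are ranked by a variance-weighted composite score. The data matrix is a random stand-in.

```python
# Extract five comprehensive indices from 48 indicators and rank 12 areas.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.random((12, 48))                     # 12 coastal areas x 48 indices

Z = StandardScaler().fit_transform(X)
pca = PCA(n_components=5)
scores = pca.fit_transform(Z)                # five comprehensive indices
weights = pca.explained_variance_ratio_
composite = scores @ weights                 # overall SDC score per area
ranking = np.argsort(-composite)             # sequencing of the 12 areas
print(ranking)
```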
The performance of traditional Voice Activity Detection (VAD) algorithms declines sharply in lower Signal-to-Noise Ratio (SNR) environments. In this paper, a feature-weighting likelihood method is proposed for noise-robust VAD. The contribution of dynamic features to the likelihood score can be increased via this method, which consequently improves the noise robustness of VAD. A divergence-based dimension reduction method is proposed to save computation; it removes the feature dimensions with smaller divergence values at the cost of a slight performance degradation. Experimental results on the Aurora II database show that the detection performance in noisy environments can be remarkably improved by the proposed method when a model trained on clean data is used to detect speech endpoints. Using the weighted likelihood on the dimension-reduced features achieves comparable, or even better, performance than the original full-dimensional features.
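A small sketch of the divergence-based selection, under the assumption that each class is modeled per dimension by a univariate Gaussian: the dimensions with the smallest symmetric KL divergence between the speech and non-speech models are dropped. The frame data and the number of retained dimensions are stand-ins.

```python
# Score each feature dimension by the symmetric KL divergence between
# per-class Gaussians and keep the most discriminative ones.
import numpy as np

rng = np.random.default_rng(2)
speech = rng.normal(1.0, 1.5, size=(4000, 39))   # e.g. MFCC + delta features
noise = rng.normal(0.0, 1.0, size=(4000, 39))

m1, v1 = speech.mean(0), speech.var(0)
m2, v2 = noise.mean(0), noise.var(0)
# Symmetric KL divergence between two univariate Gaussians, per dimension.
div = 0.5 * ((v1 / v2 + v2 / v1) + (m1 - m2) ** 2 * (1 / v1 + 1 / v2)) - 1.0

keep = np.argsort(-div)[:24]   # drop the dimensions with smallest divergence
print("kept dims:", np.sort(keep))
```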
The precision of the kernel independent component analysis (KICA) algorithm depends on the type and parameter values of the kernel function. Therefore, it is of great significance to study how to choose KICA's kernel parameters to improve its feature dimension reduction results. In this paper, a fitness function is first established using the idea of the Fisher discriminant function. Then the global optimal solution of the fitness function is searched by a particle swarm optimization (PSO) algorithm, and a multi-state information dimension reduction algorithm based on PSO-KICA is established. Finally, the ability of this algorithm to enhance the precision of feature dimension reduction is verified.
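A hedged sketch of the PSO search follows: since scikit-learn offers no kernel ICA, KernelPCA stands in for KICA, and the fitness is a simple Fisher-style ratio of between-class to within-class scatter in the reduced space. The PSO constants are arbitrary.

```python
# Tune an RBF kernel parameter with a basic PSO loop against a Fisher-style
# fitness; KernelPCA is an assumed stand-in for KICA.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import KernelPCA

X, y = load_iris(return_X_y=True)

def fisher_fitness(gamma):
    Z = KernelPCA(n_components=2, kernel="rbf", gamma=float(gamma)).fit_transform(X)
    classes = np.unique(y)
    means = np.array([Z[y == c].mean(0) for c in classes])
    between = ((means - Z.mean(0)) ** 2).sum()
    within = sum(((Z[y == c] - means[i]) ** 2).sum() for i, c in enumerate(classes))
    return between / within          # larger = better class separation

rng = np.random.default_rng(3)
pos = rng.uniform(0.01, 5.0, 20)     # particle positions (gamma values)
vel = np.zeros(20)
pbest, pbest_fit = pos.copy(), np.array([fisher_fitness(g) for g in pos])
for _ in range(15):
    gbest = pbest[pbest_fit.argmax()]
    vel = 0.7 * vel + 1.5 * rng.random(20) * (pbest - pos) \
                    + 1.5 * rng.random(20) * (gbest - pos)
    pos = np.clip(pos + vel, 1e-3, 10.0)
    fit = np.array([fisher_fitness(g) for g in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
print("best gamma:", pbest[pbest_fit.argmax()])
```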
In our previous work, we gave an algorithm for segmenting a simplex in n-dimensional space into n+1 polyhedrons and provided a map F which maps the n-dimensional unit cube to these polyhedrons. In this paper, we prove that the map F is a one-to-one correspondence at least in lower-dimensional spaces (n ≤ 3). Moreover, we propose approximating and interpolatory subdivision schemes, with an estimate of their computational complexity, for triangular Bézier patches in 2-dimensional space. Finally, we compare our schemes with Goldman's in computational complexity and speed.
This paper presents a new dimension reduction strategy for medium and large-scale linear programming problems. The proposed method uses a subset of the original constraints and combines two algorithms: the weighted average and the cosine simplex algorithm. The first approach identifies binding constraints by using the weighted average of each constraint, whereas the second algorithm is based on the cosine similarity between the vector of the objective function and the constraints. These two approaches are complementary, and when used together, they locate the essential subset of initial constraints required for solving medium and large-scale linear programming problems. After reducing the dimension of the linear programming problem using the subset of the essential constraints, the solution method can be chosen from any suitable method for linear programming. The proposed approach was applied to a set of well-known benchmarks as well as more than 2000 random medium and large-scale linear programming problems. The results are promising, indicating that the new approach contributes to the reduction of both the size of the problems and the total number of iterations required. A tree-based classification model also confirmed the need for combining the two approaches. A detailed numerical example, the general numerical results, and the statistical analysis for the decision tree procedure are presented.
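A simplified sketch of the cosine criterion alone (the weighted-average criterion is omitted for brevity): constraints are ranked by the cosine similarity between the objective vector and each constraint row, the reduced LP is solved with SciPy, and the full problem is used as a fallback if a dropped constraint turns out to be violated.

```python
# Reduce max c'x s.t. Ax <= b, x >= 0 by keeping the constraints most aligned
# with the objective, then verify against the full constraint set.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
n, m = 50, 400
A = rng.random((m, n))
b = A.sum(axis=1) + rng.random(m)            # keeps the origin feasible
c = rng.random(n)

cosine = (A @ c) / (np.linalg.norm(A, axis=1) * np.linalg.norm(c))
keep = np.argsort(-cosine)[: 3 * n]          # most "aligned" constraints

res = linprog(-c, A_ub=A[keep], b_ub=b[keep], bounds=(0, None))
violated = A @ res.x > b + 1e-9
if violated.any():                           # reduced problem missed a binding row
    res = linprog(-c, A_ub=A, b_ub=b, bounds=(0, None))
print("objective:", -res.fun, "constraints used:", len(keep))
```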
Large-scale cooling energy systems have developed well in the past decade. However, their optimization is still a problem to be tackled due to the nonlinearity and large scale of existing systems, and reducing the scale of problems without oversimplifying the actual system model is a big challenge. This paper proposes a dimension reduction-based many-objective optimization (DRMO) method to solve an accurate nonlinear model of a practical large-scale cooling energy system. In the first stage, the many objectives and many variables of the large system are pre-processed to reduce the overall scale of the optimization problem: the relationships among the objectives are analyzed to find a few representative ones, and key control variables are extracted to reduce the dimension of the variables and the number of equality constraints. In the second stage, the many-objective group search optimization (GSO) method is used to solve the low-dimensional nonlinear model, and a Pareto front is obtained. In the final stage, candidate solutions along the Pareto front are graded according to the many-objective preferences of system operators, and the candidate solution with the highest average utility value is selected as the best running mode. Simulations are carried out on a 619-node, 614-branch cooling system, and the results show the ability of the proposed method to solve large-scale system operation problems.
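A sketch of the first-stage objective reduction, under an assumed criterion: objectives are evaluated on sampled operating points and one representative is kept from each group of highly correlated objectives. The objective functions and the 0.95 threshold are illustrative stand-ins.

```python
# Greedy correlation-based objective reduction on sampled operating points.
import numpy as np

rng = np.random.default_rng(5)
samples = rng.random((500, 8))                      # sampled control settings
objs = np.column_stack([
    samples.sum(1),                                 # e.g. total chiller power
    samples.sum(1) + 0.01 * rng.random(500),        # nearly redundant objective
    samples[:, 0] - samples[:, 1],                  # e.g. supply-return imbalance
    samples[:, 2] * samples[:, 3],                  # e.g. pump energy proxy
])

corr = np.corrcoef(objs, rowvar=False)
keep = []
for j in range(objs.shape[1]):
    if all(abs(corr[j, k]) < 0.95 for k in keep):   # not explained by a kept one
        keep.append(j)
print("representative objectives:", keep)
```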
Purpose: Exploring a dimensionality reduction model that can adeptly eliminate outliers and select the appropriate number of clusters is of profound theoretical and practical importance. Additionally, the interpretability of these models presents a persistent challenge. Design/methodology/approach: This paper proposes two innovative dimensionality reduction models based on integer programming (DRMBIP). These models assess compactness through the correlation of each indicator with its class center, while separation is evaluated by the correlation between different class centers. In contrast to DRMBIP-p, DRMBIP-v treats the threshold parameter as a variable, aiming to optimally balance compactness and separation. Findings: Using data from the Global Health Observatory (GHO), this study investigates 141 indicators that influence life expectancy. The findings reveal that DRMBIP-p effectively reduces the dimensionality of the data while ensuring compactness, and it remains compatible with other models. Additionally, DRMBIP-v finds the optimal result, showing exceptional separation. Visualization of the results reveals that all classes have high compactness. Research limitations: DRMBIP-p requires the input of the correlation threshold parameter, which plays a pivotal role in the effectiveness of the final dimensionality reduction results. In DRMBIP-v, making the threshold parameter a variable potentially emphasizes either separation or compactness, which necessitates a manual adjustment of the overflow component within the objective function. Practical implications: The DRMBIP presented in this paper is adept at uncovering the primary geometric structures within high-dimensional indicators. Validated on life expectancy data, this paper demonstrates its potential to assist data miners with the reduction of data dimensions. Originality/value: To our knowledge, this is the first time that integer programming has been used to build a dimensionality reduction model with indicator filtering. It not only has applications in life expectancy, but also has clear advantages in data mining work that requires precise class centers.
High-dimensional hyperspectral image classification is a challenging task due to the spectral feature vectors. The high correlation between these features, together with noise, greatly affects classification performance. To overcome this, dimensionality reduction techniques are widely used. Numerous deep learning models have recently been proposed for traditional image processing applications; however, in hyperspectral image classification, the features of deep learning models are less explored. Thus, for efficient hyperspectral image classification, a depth-wise convolutional neural network is presented in this research work. To handle the dimensionality issue in the classification process, an optimized self-organizing map (SOM) model is employed using a water strider optimization algorithm. The network parameters of the SOM are optimized by water strider optimization, which mitigates the dimensionality issues and enhances classification performance. Standard datasets such as Indian Pines and the University of Pavia (UP) are considered for experimental analysis. Existing dimensionality reduction methods, namely Enhanced Hybrid-Graph Discriminant Learning (EHGDL), Local Geometric Structure Fisher Analysis (LGSFA), Discriminant Hyper-Laplacian Projection (DHLP), the Group-Based Tensor Model (GBTM), and Lower-Rank Tensor Approximation (LRTA), are compared with the proposed optimized SOM model. Results confirm the superior performance of the proposed model, with 98.22% accuracy on the Indian Pines dataset and 98.21% accuracy on the University of Pavia dataset, over the existing maximum likelihood classifier and support vector machine (SVM).
Dimension reduction provides a powerful means of reducing the number of random variables under consideration. However, large datasets contain many similar tuples, so before reducing the dimension of a dataset we remove some similar tuples, retaining the main information of the dataset while accelerating the dimension reduction. Accordingly, we propose a dimension reduction technique based on biased sampling, a new procedure that incorporates features of both dimension reduction and biased sampling to obtain a computationally efficient means of reducing the number of random variables under consideration. In this paper, we choose Principal Component Analysis (PCA) as the main dimension reduction algorithm to study, and we show how this approach works.
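A minimal sketch of the idea, with grid rounding as a crude stand-in for the paper's biased sampling: near-duplicate tuples are thinned out first, PCA is fitted on the survivors, and the full dataset is then embedded.

```python
# Thin near-duplicate tuples before PCA by keeping one tuple per coarse grid cell.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
base = rng.random((300, 10))
# 20 near-copies of each base tuple simulate the "many similar tuples" setting.
X = np.vstack([base + 1e-3 * rng.standard_normal((300, 10)) for _ in range(20)])

cells = np.round(X, decimals=1)                     # coarse grid cell per tuple
_, first = np.unique(cells, axis=0, return_index=True)
X_small = X[np.sort(first)]                         # one representative per cell

pca = PCA(n_components=2).fit(X_small)              # fit on the reduced set
emb = pca.transform(X)                              # embed the full dataset
print(f"{len(X)} tuples -> {len(X_small)} after sampling; embedding {emb.shape}")
```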
With the extensive application of large-scale array antennas, the increasing number of array elements leads to an increasing dimension of received signals, making it difficult to meet the real-time requirement of direction of arrival (DOA) estimation due to the computational complexity of the algorithms. Traditional subspace algorithms require estimation of the covariance matrix, which has high computational complexity and is prone to producing spurious peaks. In order to reduce the computational complexity of DOA estimation algorithms and improve their estimation accuracy under large numbers of array elements, this paper proposes a DOA estimation method based on the Krylov subspace and a weighted l1-norm. The method uses multistage Wiener filter (MSWF) iterations to solve for a basis of the Krylov subspace as an estimate of the signal subspace, further uses a measurement matrix to reduce the dimensionality of the signal subspace observation, constructs a weighting matrix, and combines sparse reconstruction to establish a convex optimization function based on the residual sum of squares and the weighted l1-norm to solve for the target DOA. Simulation results show that the proposed method has high resolution under large array conditions, effectively suppresses spurious peaks, reduces computational complexity, and has good robustness in low signal-to-noise ratio (SNR) environments.
In order to accurately identify speech emotion information, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Based on the existing locality preserving projections and graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method specifically utilizes supervised embedding graphs and preserves the inner products of samples in the original space to maintain enough information for speech emotion recognition. Then, kernel DCLPP (KDCLPP) is also proposed to extend the mapping form. Validated by experiments on the EMO-DB and eNTERFACE'05 corpora, the proposed method clearly outperforms existing common dimensionality reduction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), graph-based Fisher analysis (GbFA), and so on, with different categories of classifiers.
Single-cell RNA sequencing (scRNA-seq) is a powerful technique for analyzing transcriptomic heterogeneity at the single-cell level. An effective low-dimensional representation and visualization of the original scRNA-seq data is an important step for studying cell subpopulations and lineages. At the single-cell level, the transcriptional fluctuations are much larger than the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events; therefore, scRNA-seq data are much noisier than traditional bulk RNA-seq data. In this study, we propose the deep variational autoencoder for scRNA-seq data (VASC), a deep multi-layer generative model, for the unsupervised dimension reduction and visualization of scRNA-seq data. VASC can explicitly model the dropout events and find the nonlinear hierarchical feature representations of the original data. Tested on over 20 datasets, VASC shows superior performance in most cases and exhibits broader dataset compatibility compared with four state-of-the-art dimension reduction and visualization methods. In addition, VASC provides better representations for very rare cell populations in the 2D visualization. As a case study, VASC successfully re-establishes the cell dynamics in pre-implantation embryos and identifies several candidate marker genes associated with early embryo development. Moreover, VASC also performs well on a 10× Genomics dataset with more cells and a higher dropout rate.
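A heavily simplified PyTorch sketch of the modeling idea: a variational autoencoder whose decoder adds an explicit dropout (zero-inflation) gate so technical zeros are treated separately from expression levels. VASC itself uses a deeper architecture and a different likelihood; everything here, including the Gaussian reconstruction loss, is illustrative only.

```python
# A toy VAE with an explicit dropout gate, loosely in the spirit of VASC.
import torch
import torch.nn as nn

class MiniVASC(nn.Module):
    def __init__(self, genes, latent=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(genes, 128), nn.ReLU())
        self.mu, self.logvar = nn.Linear(128, latent), nn.Linear(128, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(),
                                 nn.Linear(128, genes), nn.Softplus())
        self.drop_gate = nn.Sequential(nn.Linear(latent, genes), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        recon = self.dec(z) * (1 - self.drop_gate(z))          # gated expression
        kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(1).mean()
        return recon, mu, kl

x = torch.rand(64, 2000)           # normalized expression, 64 cells x 2000 genes
model = MiniVASC(2000)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):                 # a few illustrative training steps
    recon, mu, kl = model(x)
    loss = ((recon - x)**2).sum(1).mean() + kl
    opt.zero_grad(); loss.backward(); opt.step()
print("2-D embedding:", mu.shape)  # mu is the low-dimensional representation
```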
In this paper, a low-dimensional multiple-input and multiple-output (MIMO) model predictive control (MPC) configuration is presented for unknown spatially-distributed systems (SDSs) governed by partial differential equations (PDEs). First, dimension reduction with principal component analysis (PCA) is used to transform the high-dimensional spatio-temporal data into a low-dimensional time domain. The MPC strategy is proposed based on online-corrected low-dimensional models, where the state of the system at a previous time is used to correct the output of the low-dimensional models. Sufficient conditions for closed-loop stability are presented and proven. Simulations demonstrate the accuracy and efficiency of the proposed methodologies.
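A sketch of the first step, on synthetic snapshots standing in for PDE data: the spatio-temporal field is compressed with PCA, and a low-dimensional linear time-domain model a[t+1] ≈ A a[t] is identified by least squares, which is the kind of reduced model an MPC can then use.

```python
# Compress spatio-temporal snapshots with PCA and fit a reduced linear model.
import numpy as np
from sklearn.decomposition import PCA

x_grid = np.linspace(0, 1, 200)
t_grid = np.arange(400)
# Synthetic spatio-temporal field: two decaying/travelling modes.
snaps = np.array([np.sin(np.pi * x_grid - 0.02 * t) * np.exp(-0.002 * t)
                  + 0.3 * np.sin(3 * np.pi * x_grid + 0.05 * t) for t in t_grid])

pca = PCA(n_components=3)
a = pca.fit_transform(snaps)                         # low-dimensional states
M, *_ = np.linalg.lstsq(a[:-1], a[1:], rcond=None)   # a[t] @ M ~= a[t+1]
A = M.T                                              # a[t+1] ~= A @ a[t]
print("reduced transition matrix A:\n", A.round(3))
```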
We are concerned with partial dimension reduction for the conditional mean function in the presence of controlling variables. We suggest a profile least squares approach to perform partial dimension reduction for a general class of semi-parametric models. The asymptotic properties of the resulting estimates for the central partial mean subspace and the mean function are provided. In addition, a Wald-type test is proposed to evaluate a linear hypothesis of the central partial mean subspace, and a generalized likelihood ratio test is constructed to check whether the nonparametric mean function has a specific parametric form. These tests can be used to evaluate whether there exist interactions between the covariates and the controlling variables, and if any, in what form. A Bayesian information criterion (BIC)-type criterion is applied to determine the structural dimension of the central partial mean subspace, and its consistency is also established. Numerical studies through simulations and real data examples are conducted to demonstrate the power and utility of the proposed semi-parametric approaches.
Some dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed, but the projection matrix obtained in these approaches considers only the between-class margin based on the SVM while ignoring the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL considers the between-class margins from SVM and LDA, and the within-class compactness from LDA, to obtain the projection matrix. As a result, DRSL can combine the between-class and within-class information and fit the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments applied to classification techniques show the effectiveness of the proposed method.
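A hedged sketch of combining the two ingredients named above: the between-class directions of a linear SVM are stacked with LDA's within-class-aware discriminant directions into one orthonormal projection. DRSL derives its matrix from a joint criterion; this simple concatenation only illustrates the information sources.

```python
# Build a projection from linear-SVM margin directions plus LDA directions.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

svm_dirs = LinearSVC(dual=False).fit(X, y).coef_.T       # between-class margins
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
lda_dirs = lda.scalings_[:, :2]                          # within-class structure

P, _ = np.linalg.qr(np.hstack([svm_dirs, lda_dirs]))     # orthonormal projection
Z = X @ P                                                # reduced representation
print("projected shape:", Z.shape)
```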