The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Purpose: Exploring a dimensionality reduction model that can adeptly eliminate outliers and select the appropriate number of clusters is of profound theoretical and practical importance. Additionally, the interpretability of these models presents a persistent challenge. Design/methodology/approach: This paper proposes two innovative dimensionality reduction models based on integer programming (DRMBIP). These models assess compactness through the correlation of each indicator with its class center, while separation is evaluated by the correlation between different class centers. In contrast to DRMBIP-p, DRMBIP-v treats the threshold parameter as a variable, aiming to optimally balance compactness and separation. Findings: Using data from the Global Health Observatory (GHO), this study investigates 141 indicators that influence life expectancy. The findings reveal that DRMBIP-p effectively reduces the dimensionality of the data while ensuring compactness. It also maintains compatibility with other models. Additionally, DRMBIP-v finds the optimal result, showing exceptional separation. Visualization of the results reveals that all classes have high compactness. Research limitations: DRMBIP-p requires the correlation threshold parameter as input, which plays a pivotal role in the effectiveness of the final dimensionality reduction results. In DRMBIP-v, turning the threshold parameter into a variable potentially emphasizes either separation or compactness. This necessitates a manual adjustment to the overflow component within the objective function. Practical implications: The DRMBIP presented in this paper is adept at uncovering the primary geometric structures within high-dimensional indicators. Validated by life expectancy data, this paper demonstrates its potential to assist data miners in reducing data dimensions. Originality/value: To our knowledge, this is the first time that integer programming has been used to build a dimensionality reduction model with indicator filtering. It not only has applications in life expectancy, but also has obvious advantages in data mining work that requires precise class centers.
High-dimensional hyperspectral image classification is a challenging task due to the spectral feature vectors. The high correlation between these features and the noise greatly affects classification performance. To overcome this, dimensionality reduction techniques are widely used. Numerous deep learning models have recently been proposed for traditional image processing applications. However, in hyperspectral image classification, the features of deep learning models are less explored. Thus, for efficient hyperspectral image classification, a depth-wise convolutional neural network is presented in this research work. To handle the dimensionality issue in the classification process, an optimized self-organized map (SOM) model is employed using a water strider optimization algorithm. The network parameters of the self-organized map are optimized by the water strider optimization, which reduces the dimensionality issues and enhances classification performance. Standard datasets such as Indian Pines and the University of Pavia (UP) are considered for experimental analysis. Existing dimensionality reduction methods such as Enhanced Hybrid-Graph Discriminant Learning (EHGDL), local geometric structure Fisher analysis (LGSFA), Discriminant Hyper-Laplacian projection (DHLP), Group-based tensor model (GBTM), and Lower rank tensor approximation (LRTA) are compared with the proposed optimized SOM model. Results confirm the superior performance of the proposed model, with 98.22% accuracy for the Indian Pines dataset and 98.21% accuracy for the University of Pavia dataset, over the existing maximum likelihood classifier and support vector machine (SVM).
Tin halide perovskites (THPs) have received extensive attention due to their low toxicity and excellent optoelectronic properties, and are considered the most promising alternatives for developing efficient lead-free perovskite solar cells. However, because Sn²⁺ is inherently prone to oxidation to Sn⁴⁺ and crystallization is fast, tin perovskite solar cells (TPSCs) show relatively poor performance and stability compared to their lead counterparts. Recently, the introduction of bulky organic spacers into three-dimensional (3D) THPs for dimensional regulation has been shown not only to prevent the intrusion of water and oxygen, but also to inhibit the self-doping effect and ion migration. In this review, we detail how dimensional regulation enables TPSCs with high performance and superior stability. First, we summarize the intrinsic properties of THPs and analyze the root causes of their poor performance and instability. Next, we discuss the specific structure and types of the dimensional regulation strategy. Then, the mechanism of dimensional regulation is discussed in detail, mainly in terms of inhibiting Sn²⁺ oxidation, optimizing crystallization, passivating defects, and improving energy level alignment. Finally, future challenges and prospects for dimensional regulation are elaborated to help researchers develop more efficient and stable TPSCs.
In order to accurately identify speech emotion information, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Based on the existing locality preserving projections and graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method specifically utilizes supervised embedding graphs and keeps the original space for the inner products of samples, to maintain enough information for speech emotion recognition. The kernel DCLPP (KDCLPP) is then proposed to extend the mapping form. Validated by experiments on the EMO-DB and eNTERFACE'05 corpora, the proposed method clearly outperforms existing common dimensionality reduction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), and graph-based Fisher analysis (GbFA), with different categories of classifiers.
Several dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed. However, these approaches derive the projection matrix from the SVM between-class margin alone, ignoring the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL considers the between-class margins from SVM and LDA, and the within-class compactness from LDA, to obtain the projection matrix. As a result, DRSL combines between-class and within-class information and fits the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments with classification techniques show the effectiveness of the proposed method.
We present a new algorithm for manifold learning and nonlinear dimensionality reduction. Based on a set of unorganized data points sampled with noise from a parameterized manifold, the local geometry of the manifold is learned by constructing an approximation for the tangent space at each point, and those tangent spaces are then aligned to give the global coordinates of the data points with respect to the underlying manifold. We also present an error analysis of our algorithm showing that reconstruction errors can be quite small in some cases. We illustrate our algorithm using curves and surfaces both in 2D/3D Euclidean spaces and higher dimensional Euclidean spaces. We also address several theoretical and algorithmic issues for further research and improvements.
Driven by the needs of real applications such as text categorization and image classification, multi-label learning has gradually become a hot research topic in recent years. Much attention has been paid to research on multi-label classification algorithms. Considering that the high dimensionality of multi-label datasets may cause the curse of dimensionality and will hamper the classification process, a dimensionality reduction algorithm named multi-label kernel discriminant analysis (MLKDA) is proposed to reduce the dimensionality of multi-label datasets. MLKDA, using the kernel trick, processes the multiple labels as a whole and achieves nonlinear dimensionality reduction with an idea similar to linear discriminant analysis (LDA). In the classification of multi-label data, the extreme learning machine (ELM) is an efficient algorithm that retains good accuracy. MLKDA, combined with ELM, performs well in multi-label learning experiments on several datasets. The experiments on both static data and data streams show that MLKDA outperforms multi-label dimensionality reduction via dependence maximization (MDDM) and multi-label linear discriminant analysis (MLDA) in cases of balanced datasets and stronger correlation between tags, and that ELM is also a good choice for multi-label classification.
Dimensionality reduction and data visualization are useful and important processes in pattern recognition. Many techniques have been developed in recent years. The self-organizing map (SOM) can be an efficient method for this purpose. This paper reviews recent advances in this area and related approaches such as multidimensional scaling (MDS), nonlinear PCA, and principal manifolds, as well as the connections of the SOM and its recent variant, the visualization induced SOM (ViSOM), with these approaches. The SOM is shown to produce a quantized, qualitative scaling, while the ViSOM produces a quantitative or metric scaling and approximates a principal curve/surface. The SOM can also be regarded as a generalized MDS that relates two metric spaces by forming a topological mapping between them. The relationships among various recently proposed techniques such as ViSOM, Isomap, LLE, and eigenmap are discussed and compared.
Arc sound is well known as a potential and available resource for monitoring and controlling the weld penetration status, which is very important to welding process quality control, so much attention has been paid to the relationships between arc sound and welding parameters. Some non-linear mapping models correlating arc sound with welding parameters have been established with the help of neural networks. However, research on using arc sound to monitor and diagnose the welding process is still in its infancy. A self-made real-time sensing system is applied to study arc sound under typical penetration statuses, including partial penetration, unstable penetration, full penetration, and excessive penetration, in metal inert-gas (MIG) flat tailored welding with spray transfer. The arc sound is pretreated using wavelet de-noising and short-time windowing, and its characteristics in the time domain, frequency domain, cepstrum domain, and geometric domain, which characterize the weld penetration status, are extracted. Subsequently, a high-dimensional eigenvector is constructed and the feature-level parameters are successfully fused using principal component analysis (PCA). Ultimately, the 60-dimensional eigenvector is replaced by a synthesized 8-dimensional vector, which compresses the feature space and provides technical support for future pattern classification of typical penetration statuses using arc sound in MIG welding.
The high dimensionality of hyperspectral imagery imposes a burden on further processing. A new Fast Independent Component Analysis (FastICA) approach to dimensionality reduction for hyperspectral imagery is presented. The virtual dimensionality is introduced to determine the number of dimensions that need to be preserved. Since there is no prioritization among the independent components generated by FastICA, the mixing matrix of FastICA is initialized by endmembers, which are extracted using an unsupervised maximum distance method. Minimum Noise Fraction (MNF) is used to preprocess the original data, which can reduce the computational complexity of FastICA significantly. Finally, FastICA is performed on the selected principal components acquired by MNF to generate the expected independent components in accordance with the order of the endmembers. Experimental results demonstrate that the proposed method outperforms second-order statistics-based transforms such as principal component analysis.
The framework of a text classification system was presented. The high dimensionality of the feature space in text classification was studied. Mutual information is a widely used information-theoretic measure that describes the stochastic dependency of discrete random variables. This measure was used as a criterion to reduce the high dimensionality of feature vectors in Web text classification. Feature selections or conversions, including linear and non-linear feature conversions, were performed using maximum mutual information. Entropy was used and extended to find suitable features in pattern recognition systems. This lays a favorable foundation for text classification mining.
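The mutual-information criterion described in this abstract can be illustrated with a minimal sketch. This is not the paper's method: it only ranks discrete features by their empirical mutual information with the class label, and the names `mutual_information`, `select_top_k`, and the toy term-presence data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def select_top_k(features, labels, k):
    """Return the k feature names with the highest mutual information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: term presence (1/0) in four documents.
labels = ["sport", "sport", "tech", "tech"]
features = {
    "goal": [1, 1, 0, 0],  # perfectly informative about the class
    "the": [1, 1, 1, 1],   # constant, carries no information
    "code": [0, 1, 1, 1],  # partially informative
}

if __name__ == "__main__":
    for name, col in features.items():
        print(name, round(mutual_information(col, labels), 4))
    print(select_top_k(features, labels, 2))
```

A term that perfectly separates two equally likely classes scores 1 bit, while a term that appears in every document scores 0, so ranking by this score discards uninformative dimensions first.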
This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), the modified SpC2DLDPCA and SpC2DLPPCA algorithms offer four benefits: (1) SpC2DLDPCA and SpC2DLPPCA avoid the long computation time that larger matrices incur when computing eigenvalues and eigenvectors. (2) SpC2DLDPCA and SpC2DLPPCA can extract local information for recognition. (3) The idea of subblocks is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss. (4) The idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA have higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
A hyperspectral image (HSI) contains a wealth of spectral information, which makes fine classification of ground objects possible. At the same time, overly redundant information in HSI brings many challenges. Specifically, the lack of training samples and the high computational cost are inevitable obstacles in classifier design. To solve these problems, dimensionality reduction is usually adopted. Recently, graph-based dimensionality reduction has become a hot topic. In this paper, the graph-based methods for HSI dimensionality reduction are summarized from the following aspects. 1) The traditional graph-based methods employ Euclidean distance to explore the local information of samples in the spectral feature space. 2) The dimensionality reduction methods based on sparse or collaborative representation regard the sparse or collaborative coefficients as graph weights to effectively reduce reconstruction errors and represent the most important information of HSI in the dictionary. 3) Improved methods based on sparse or collaborative graphs have made great progress by considering global low-rank information, local intra-class information, and spatial information. To compare typical techniques, three real HSI datasets were used to carry out relevant experiments, and the experimental results were analyzed and discussed. Finally, future developments in this research field are discussed.
A micro-electromechanical system (MEMS) scanning mirror accelerates the raster scanning of optical-resolution photoacoustic microscopy (OR-PAM). However, the nonlinear tilt angle-voltage characteristic of a MEMS mirror introduces distortion into the maximum back-projection image. Moreover, the size of the Airy disk, ultrasonic sensor properties, and thermal effects decrease the resolution. Thus, in this study, we propose a spatial weight matrix (SWM) with dimensionality reduction for image reconstruction. The three-layer SWM contains the invariable information of the system, which includes a spatially dependent distortion correction and 3D deconvolution. We employed an ordinal-valued Markov random field and the Harris-Stephens algorithm, as well as a modified delay-and-sum method during time reversal. The results from the experiments and a quantitative analysis demonstrate that images can be effectively reconstructed using an SWM; this is also true for severely distorted images. The index of the mutual information between the reference images and registered images was 70.33 times higher than the initial index, on average. Moreover, the peak signal-to-noise ratio was increased by 17.08% after 3D deconvolution. This accomplishment offers a practical approach to image reconstruction and a promising method for achieving real-time distortion correction in MEMS-based OR-PAM.
Dimension reduction is defined as the process of projecting high-dimensional data onto a much lower-dimensional space. Dimension reduction methods are variously applied in regression, classification, feature analysis, and visualization. In this paper, we review in detail the latest versions of the methods that have been extensively developed in the past decade.
Psychometric theory requires unidimensionality (i.e., scale items should represent a common latent variable). One advocated approach to testing unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures based on the two item sets, compare the two estimates on a person-by-person basis using t-tests, and determine the number of cases that differ significantly at the 0.05 level; if ≤5% of tests are significant, or the lower bound of a binomial 95% confidence interval (CI) of the observed proportion overlaps 5%, then it is suggested that strict unidimensionality can be inferred; otherwise the scale is multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores the impact of sample size and of the method of estimating the 95% binomial CI upon conclusions according to recommended conventions. Normal approximation, "exact", Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08, and 0.10 and sample sizes from n = 100 to n = 2500. Lower 95% CI boundaries were inspected regarding coverage of the 5% threshold. Results showed that all binomial 95% CIs included as well as excluded 5% as an effect of sample size for all three investigated proportions, except for the Wilson, Agresti-Coull, and Jeffreys CIs, which did not include 5% for any sample size with a 10% observed proportion. The normal approximation CI was most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted as any hypothesis testing procedure and is dependent on sample size as well as on the binomial CI estimation procedure. The PCA/t-test protocol should not be viewed as a "definite" test of unidimensionality and does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context, and purpose of measurement.
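The dependence of the 5% check on sample size and CI method can be made concrete with a small sketch. It uses the standard formulas for the normal-approximation (Wald) and Wilson score intervals; the function names and the grid of sample sizes printed below are assumptions for illustration and do not reproduce the paper's full analysis (the "exact", Agresti-Coull, and Jeffreys intervals are omitted).

```python
import math

Z95 = 1.959963984540054  # two-sided 95% standard normal quantile


def normal_ci(p_hat, n, z=Z95):
    """Normal-approximation (Wald) interval for a binomial proportion."""
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_ci(p_hat, n, z=Z95):
    """Wilson score interval for a binomial proportion."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def lower_bound_covers(p_hat, n, ci, threshold=0.05):
    """True if the lower CI bound is at or below the threshold (5% by default)."""
    lower, _ = ci(p_hat, n)
    return lower <= threshold


if __name__ == "__main__":
    for p_hat in (0.06, 0.08, 0.10):
        for n in (100, 500, 2500):
            print(
                f"p={p_hat:.2f} n={n:4d} "
                f"normal covers 5%: {lower_bound_covers(p_hat, n, normal_ci)}  "
                f"Wilson covers 5%: {lower_bound_covers(p_hat, n, wilson_ci)}"
            )
```

For an observed proportion of 0.10 at n = 100, the normal-approximation lower bound falls below 5% while the Wilson lower bound stays above it, so the two methods lead to opposite conclusions on the same data.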
This paper addresses the regularity and finite dimensionality of the global attractor for the plate equation on the unbounded domain. The existence of the attractor in the phase space has been established in an earlier work of the author. It is shown that the attractor is actually a bounded set of the phase space and has finite fractal dimensionality.
Multi-label data with high dimensionality occurs often, and it produces large time and energy overheads when used directly in classification tasks. To solve this problem, a novel algorithm called multi-label dimensionality reduction via semi-supervised discriminant analysis (MSDA) was proposed. It was expected to derive an objective discriminant function that is as smooth as possible on the data manifold through multi-label learning and semi-supervised learning. By virtue of the latent information provided by the graph weighted matrix of sample attributes and the similarity correlation matrix of partial sample labels, MSDA maximized the separability between different classes and estimated the intrinsic geometric structure in the lower manifold space by employing unlabeled data. Extensive experimental results on several real multi-label datasets show that after dimensionality reduction using MSDA, the average classification accuracy is about 9.71% higher than that of other algorithms, and several evaluation metrics such as Hamming loss are also superior to those of other dimensionality reduction methods.
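For readers unfamiliar with the Hamming loss metric mentioned in this abstract, here is a minimal sketch using the standard definition (the fraction of label slots predicted incorrectly, lower is better). The function name and the toy label matrices are assumptions for illustration and are not taken from the paper.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label assignments that differ between truth and prediction."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


# Hypothetical example: two samples, three binary labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]

if __name__ == "__main__":
    print(hamming_loss(y_true, y_pred))
```

Here two of the six label slots are wrong, giving a loss of one third.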
Low-dimensional materials have excellent properties that are closely related to their dimensionality. However, the growth mechanism underlying the tunable dimensionality of such materials, from 2D triangles to 1D ribbons, has not yet been revealed. Here, we establish a general kinetic Monte Carlo model for the growth of transition metal dichalcogenides (TMDs) to address this issue. Our model is able to reproduce several key experimental findings, and reveals that the dimensionality is determined by the lattice mismatch and the interaction strength between the TMDs and the substrate. We predict that the dimensionality can be well tuned by the interaction strength and the geometry of the substrate. Our work deepens the understanding of the tunable dimensionality of low-dimensional materials and may inspire new concepts for the design of such materials with a desired dimensionality.
Funding (data boosting augmentation paper): supported in part by the National Natural Science Foundation of China (NSFC) (92167106, 61833014) and the Key Research and Development Program of Zhejiang Province (2022C01206).
Funding (DRMBIP paper): supported by the National Natural Science Foundation of China (No. 72371115) and the Natural Science Foundation of Jilin, China (No. 20230101184JC).
Abstract: High-dimensional hyperspectral image classification is a challenging task because of the spectral feature vectors. The high correlation between these features, together with noise, greatly affects classification performance. To overcome this, dimensionality reduction techniques are widely used. Numerous deep learning models have recently been proposed for traditional image processing applications. However, in hyperspectral image classification, the features of deep learning models are less explored. Thus, for efficient hyperspectral image classification, a depth-wise convolutional neural network is presented in this research work. To handle the dimensionality issue in the classification process, an optimized self-organized map (SOM) model is employed using a water strider optimization algorithm. The network parameters of the self-organized map are optimized by the water strider optimization, which reduces the dimensionality issues and enhances classification performance. Standard datasets such as Indian Pines and the University of Pavia (UP) are considered for experimental analysis. Existing dimensionality reduction methods, namely Enhanced Hybrid-Graph Discriminant Learning (EHGDL), local geometric structure Fisher analysis (LGSFA), Discriminant Hyper-Laplacian projection (DHLP), Group-based tensor model (GBTM), and Lower rank tensor approximation (LRTA), are compared with the proposed optimized SOM model. Results confirm the superior performance of the proposed model, with 98.22% accuracy on the Indian Pines dataset and 98.21% accuracy on the University of Pavia dataset, over the existing maximum likelihood classifier and support vector machine (SVM).
Funding: Financially supported by the National Natural Science Foundation of China (51702038); the Science & Technology Department of Sichuan Province (2020YFG0061); the Recruitment Program for Young Professionals; the National Key Research and Development Program of China (2017YFA0206600); and the National Natural Science Foundation of China (51773045, 21772030, 51922032, 21961160720).
Abstract: Tin halide perovskites (THPs) have received extensive attention due to their low toxicity and excellent optoelectronic properties, and are considered to be the most promising alternatives for developing efficient lead-free perovskite solar cells. However, because Sn²⁺ is inherently easily oxidized to Sn⁴⁺ and crystallization is fast, tin perovskite solar cells (TPSCs) show relatively poor performance and stability compared to their lead counterparts. Recently, the introduction of bulky organic spacers into three-dimensional (3D) THPs for dimensional regulation has been shown not only to prevent the intrusion of water and oxygen, but also to inhibit the self-doping effect and ion migration. In this review, we detail how dimensional regulation enables TPSCs with high performance and superior stability. First, we summarize the intrinsic properties of THPs and analyze the root causes of their poor performance and instability. Next, we discuss the specific structure and types of the dimensional regulation strategy. Then, the mechanism of dimensional regulation is discussed in detail, mainly in terms of inhibiting Sn²⁺ oxidation, optimizing crystallization, passivating defects, and improving energy level alignment. Finally, future challenges and prospects for dimensional regulation are elaborated to help researchers develop more efficient and stable TPSCs.
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266); the Ph.D. Program Foundation of Ministry of Education of China (No. 20110092130004); China Postdoctoral Science Foundation (No. 2015M571637).
Abstract: In order to accurately identify speech emotion information, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Based on the existing locality preserving projections and graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method specifically utilizes supervised embedding graphs, and it keeps the original space for the inner products of samples to maintain enough information for speech emotion recognition. The kernel DCLPP (KDCLPP) is then proposed to extend the mapping form. In experiments on the EMO-DB and eNTERFACE'05 corpora, the proposed method clearly outperforms existing common dimensionality reduction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), and graph-based Fisher analysis (GbFA), with different categories of classifiers.
Abstract: Several dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed. However, when these approaches derive the projection matrix, they consider only the between-class margin from SVM and ignore the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL considers the between-class margins from SVM and LDA, and the within-class compactness from LDA, to obtain the projection matrix. As a result, DRSL combines the between-class and within-class information and fits the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments with classification techniques show the effectiveness of the proposed method.
Abstract: We present a new algorithm for manifold learning and nonlinear dimensionality reduction. Based on a set of unorganized data points sampled with noise from a parameterized manifold, the local geometry of the manifold is learned by constructing an approximation for the tangent space at each point, and those tangent spaces are then aligned to give the global coordinates of the data points with respect to the underlying manifold. We also present an error analysis of our algorithm showing that reconstruction errors can be quite small in some cases. We illustrate our algorithm using curves and surfaces both in 2D/3D Euclidean spaces and higher dimensional Euclidean spaces. We also address several theoretical and algorithmic issues for further research and improvements.
Funding: Supported by the National Natural Science Foundation of China (51105052, 61173163) and the Liaoning Provincial Natural Science Foundation of China (201102037).
Abstract: Driven by real applications such as text categorization and image classification, multi-label learning has become a hot research topic in recent years. Much attention has been paid to research on multi-label classification algorithms. Because the high dimensionality of multi-label datasets may cause the curse of dimensionality and will hamper the classification process, a dimensionality reduction algorithm named multi-label kernel discriminant analysis (MLKDA) is proposed to reduce the dimensionality of multi-label datasets. MLKDA uses the kernel trick, processes the multiple labels integrally, and realizes nonlinear dimensionality reduction with an idea similar to linear discriminant analysis (LDA). For classifying multi-label data, the extreme learning machine (ELM) is an efficient algorithm that maintains good accuracy. MLKDA, combined with ELM, shows good performance in multi-label learning experiments on several datasets. The experiments on both static data and data streams show that MLKDA outperforms multi-label dimensionality reduction via dependence maximization (MDDM) and multi-label linear discriminant analysis (MLDA) in cases of balanced datasets and stronger correlation between tags, and that ELM is also a good choice for multi-label classification.
Abstract: Dimensionality reduction and data visualization are useful and important processes in pattern recognition. Many techniques have been developed in recent years. The self-organizing map (SOM) can be an efficient method for this purpose. This paper reviews recent advances in this area and related approaches such as multidimensional scaling (MDS), nonlinear PCA, and principal manifolds, as well as the connections of the SOM and its recent variant, the visualization induced SOM (ViSOM), with these approaches. The SOM is shown to produce a quantized, qualitative scaling, while the ViSOM produces a quantitative or metric scaling and approximates a principal curve/surface. The SOM can also be regarded as a generalized MDS that relates two metric spaces by forming a topological mapping between them. The relationships among various recently proposed techniques such as ViSOM, Isomap, LLE, and eigenmap are discussed and compared.
Funding: Supported by Harbin Academic Pacesetter Foundation of China (Grant No. RC2012XK006002); Zhejiang Provincial Natural Science Foundation of China (Grant No. Y1110262); Ningbo Municipal Natural Science Foundation of China (Grant No. 2011A610148); Ningbo Municipal Major Industrial Support Project of China (Grant No. 2011B1007); Heilongjiang Provincial Natural Science Foundation of China (Grant No. E2007-01).
Abstract: Arc sound is well known as a potential and available resource for monitoring and controlling the weld penetration status, which is very important to welding process quality control, so much attention has been paid to the relationships between arc sound and welding parameters. Some non-linear mapping models correlating arc sound to welding parameters have been established with the help of neural networks. However, research on utilizing arc sound to monitor and diagnose the welding process is still in its infancy. A self-made real-time sensing system is applied to study arc sound under typical penetration statuses, including partial penetration, unstable penetration, full penetration and excessive penetration, in metal inert-gas (MIG) flat tailored welding with spray transfer. The arc sound is preprocessed using wavelet de-noising and short-time windowing, and its time-domain, frequency-domain, cepstrum-domain and geometric-domain characteristics, which characterize the weld penetration status, are extracted. Subsequently, a high-dimensional eigenvector is constructed and the feature-level parameters are successfully fused using principal component analysis (PCA). Ultimately, the 60-dimensional eigenvector is replaced by a synthesized 8-dimensional vector, which compresses the feature space and provides technical support for future pattern classification of typical penetration statuses with the help of arc sound in MIG welding.
Funding: Supported by the National Natural Science Foundation of China (No. 60572135).
Abstract: The high dimensionality of hyperspectral imagery burdens further processing. A new Fast Independent Component Analysis (FastICA) approach to dimensionality reduction for hyperspectral imagery is presented. The virtual dimensionality is introduced to determine the number of dimensions that need to be preserved. Since there is no prioritization among the independent components generated by FastICA, the mixing matrix of FastICA is initialized with endmembers, which are extracted using an unsupervised maximum distance method. Minimum Noise Fraction (MNF) is used to preprocess the original data, which significantly reduces the computational complexity of FastICA. Finally, FastICA is performed on the selected principal components acquired by MNF to generate the expected independent components in accordance with the order of the endmembers. Experimental results demonstrate that the proposed method outperforms second-order statistics-based transforms such as principal component analysis.
Abstract: The framework of a text classification system is presented, and the high dimensionality of the feature space in text classification is studied. Mutual information is a widely used information-theoretic measure that describes the stochastic dependency between discrete random variables. This measure is used as a criterion to reduce the high dimensionality of feature vectors in text classification on the Web. Feature selection or conversion is performed by maximizing mutual information, including linear and non-linear feature conversions. Entropy is used and extended to find suitable features in pattern recognition systems. This establishes a favorable foundation for text classification mining.
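The mutual-information criterion described in this abstract can be illustrated with a minimal sketch. It assumes a simple plug-in (count-based) estimate of mutual information between binary term presence and the class label; the function names `mutual_information` and `select_features` and the toy documents are hypothetical and are not taken from the paper, which does not specify its estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def select_features(docs, labels, k):
    """Score each term by MI between its presence and the label; keep the k best."""
    vocab = sorted({term for doc in docs for term in doc})
    scores = {
        term: mutual_information([term in doc for doc in docs], labels)
        for term in vocab
    }
    ranked = sorted(vocab, key=lambda term: scores[term], reverse=True)
    return ranked[:k], scores


# Toy corpus (an assumption for illustration): each document is a set of terms.
docs = [
    {"goal", "match", "team"},
    {"goal", "league", "team"},
    {"stock", "market", "team"},
    {"stock", "bank", "market"},
]
labels = ["sport", "sport", "finance", "finance"]

top_terms, scores = select_features(docs, labels, 2)
print(top_terms)
print({term: round(score, 3) for term, score in scores.items()})
```

In this toy corpus, terms that appear in only one class carry a full bit of information about the label, while a term such as `team` that appears in both classes scores lower and is dropped first.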
Funding: Sponsored by the National Science Foundation of China (Grant No. 61201370, 61100103) and the Independent Innovation Foundation of Shandong University (Grant No. 2012DX07).
Abstract: This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), SpC2DLDPCA and SpC2DLPPCA offer four benefits: (1) They avoid the longer computation time that larger matrices incur when their eigenvalues and eigenvectors are computed. (2) They can extract local information for recognition. (3) The subblock idea is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss. (4) The same idea is introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
Funding: Supported by the National Key Research and Development Project (No. 2020YFC1512000); the National Natural Science Foundation of China (No. 41601344); the Fundamental Research Funds for the Central Universities (Nos. 300102320107 and 201924); in part by the General Projects of Key R&D Programs in Shaanxi Province (No. 2020GY-060); and Xi’an Science & Technology Project (Nos. 2020KJRC0126 and 202018).
Abstract: A hyperspectral image (HSI) contains a wealth of spectral information, which makes fine classification of ground objects possible. Meanwhile, overly redundant information in HSI brings many challenges. Specifically, the lack of training samples and the high computational cost are inevitable obstacles in classifier design. To solve these problems, dimensionality reduction is usually adopted. Recently, graph-based dimensionality reduction has become a hot topic. In this paper, graph-based methods for HSI dimensionality reduction are summarized from the following aspects. 1) The traditional graph-based methods employ Euclidean distance to explore the local information of samples in the spectral feature space. 2) The dimensionality reduction methods based on sparse or collaborative representation regard the sparse or collaborative coefficients as graph weights to effectively reduce reconstruction errors and represent the most important information of HSI in the dictionary. 3) Improved methods based on sparse or collaborative graphs have made great progress by considering global low-rank information, local intra-class information and spatial information. To compare typical techniques, three real HSI datasets were used to carry out relevant experiments, and the experimental results were then analysed and discussed. Finally, prospects for the future development of this research field are outlined.
Funding: Supported by National Natural Science Foundation of China, Nos. 61822505, 11774101, 61627827; Science and Technology Planning Project of Guangdong Province, No. 2015B020233016; China Postdoctoral Science Foundation, No. 2019M652943; Natural Science Foundation of Guangdong Province, No. 2019A1515011399; Guangzhou Science and Technology Program key projects, Nos. 2019050001.
Abstract: A micro-electromechanical system (MEMS) scanning mirror accelerates the raster scanning of optical-resolution photoacoustic microscopy (OR-PAM). However, the nonlinear tilt angular-voltage characteristic of a MEMS mirror introduces distortion into the maximum back-projection image. Moreover, the size of the Airy disk, the ultrasonic sensor properties, and thermal effects decrease the resolution. Thus, in this study, we propose a spatial weight matrix (SWM) with dimensionality reduction for image reconstruction. The three-layer SWM contains the invariable information of the system, which includes a spatially dependent distortion correction and 3D deconvolution. We employ an ordinal-valued Markov random field and the Harris-Stephens algorithm, as well as a modified delay-and-sum method during time reversal. The results from the experiments and a quantitative analysis demonstrate that images can be effectively reconstructed using an SWM; this is also true for severely distorted images. The index of the mutual information between the reference images and registered images was 70.33 times higher than the initial index, on average. Moreover, the peak signal-to-noise ratio increased by 17.08% after 3D deconvolution. This accomplishment offers a practical approach to image reconstruction and a promising method to achieve real-time distortion correction for MEMS-based OR-PAM.
Abstract: Dimension reduction is defined as the process of projecting high-dimensional data to a much lower-dimensional space. Dimension reduction methods are variously applied in regression, classification, feature analysis and visualization. In this paper, we review in detail the latest methods that have been extensively developed in the past decade.
Abstract: Psychometric theory requires unidimensionality (i.e., scale items should represent a common latent variable). One advocated approach to test unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures based on the two item sets, compare the two estimates on a person-by-person basis using t-tests and determine the number of cases that differ significantly at the 0.05 level; if ≤5% of tests are significant, or the lower bound of a binomial 95% confidence interval (CI) of the observed proportion overlaps 5%, then it is suggested that strict unidimensionality can be inferred; otherwise the scale is multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores the impact of sample size and of the method of estimating the 95% binomial CI upon conclusions according to recommended conventions. Normal approximation, “exact”, Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08 and 0.10 and sample sizes from n = 100 to n = 2500. Lower 95% CI boundaries were inspected regarding coverage of the 5% threshold. Results showed that all binomial 95% CIs included as well as excluded 5% as an effect of sample size for all three investigated proportions, except for the Wilson, Agresti-Coull, and Jeffreys CIs, which did not include 5% for any sample size with a 10% observed proportion. The normal approximation CI was most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted as any hypothesis testing procedure and is dependent on sample size as well as on the binomial CI estimation procedure. The PCA/t-test protocol should not be viewed as a “definite” test of unidimensionality and does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context and purpose of measurement.
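The dependence of the conclusion on sample size and interval method can be sketched with the standard textbook formulas for three of the intervals named in the abstract. This is only an illustration under stated assumptions: the function names are hypothetical, the observed proportion is passed directly rather than as a count, and the “exact” and Jeffreys intervals are omitted because they require beta quantiles that the Python standard library does not provide.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def normal_approx_interval(p_hat, n, z=Z95):
    """Normal approximation (Wald) interval for a binomial proportion."""
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(p_hat, n, z=Z95):
    """Wilson score interval for a binomial proportion."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def agresti_coull_interval(p_hat, n, z=Z95):
    """Agresti-Coull interval: a Wald interval on adjusted counts."""
    n_adj = n + z**2
    p_adj = (p_hat * n + z**2 / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


methods = {
    "normal": normal_approx_interval,
    "wilson": wilson_interval,
    "agresti-coull": agresti_coull_interval,
}

# For each sample size and observed proportion, report whether the lower
# 95% bound reaches down to the 5% threshold used by the PCA/t-test protocol.
for n in (100, 500, 2500):
    for p_hat in (0.06, 0.08, 0.10):
        covers = {
            name: interval(p_hat, n)[0] <= 0.05
            for name, interval in methods.items()
        }
        print(n, p_hat, covers)
```

With an observed proportion of 0.10 and n = 100, the normal approximation lower bound falls below 5% while the Wilson and Agresti-Coull lower bounds stay above it, so the same data would lead to different conclusions depending on the interval chosen.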
Funding: Supported by the Scientific Research Fund of Zhejiang Provincial Education Department (No. Y200804289); the Natural Science Foundation of Ningbo City (No. 2010A610102); and the K. C. Wong Magna Fund in Ningbo University.
Abstract: This paper addresses the regularity and finite dimensionality of the global attractor for the plate equation on an unbounded domain. The existence of the attractor in the phase space was established in an earlier work of the author. It is shown that the attractor is actually a bounded set of the phase space and has finite fractal dimensionality.
Funding: Project (60425310) supported by the National Science Fund for Distinguished Young Scholars; Project (10JJ6094) supported by the Hunan Provincial Natural Foundation of China.
Abstract: Multi-label data with high dimensionality occur often, and using them directly in classification tasks incurs large time and energy overheads. To solve this problem, a novel algorithm called multi-label dimensionality reduction via semi-supervised discriminant analysis (MSDA) is proposed. It aims to derive an objective discriminant function that is as smooth as possible on the data manifold by combining multi-label learning and semi-supervised learning. By virtue of the latent information provided by the graph weighted matrix of sample attributes and the similarity correlation matrix of partial sample labels, MSDA maximizes the separability between different classes and estimates the intrinsic geometric structure in the lower manifold space by employing unlabeled data. Extensive experimental results on several real multi-label datasets show that after dimensionality reduction using MSDA, the average classification accuracy is about 9.71% higher than that of other algorithms, and several evaluation metrics such as Hamming loss are also superior to those of other dimensionality reduction methods.
Funding: Supported by the Ministry of Science and Technology (No. 2018YFA0208702); the National Natural Science Foundation of China (No. 32090044, No. 21973085, No. 21833007, No. 21790350); Anhui Initiative in Quantum Information Technologies (AHY090200); and the Fundamental Research Funds for the Central Universities (WK2340000104).
Abstract: Low-dimensional materials have excellent properties that are closely related to their dimensionality. However, the growth mechanism underlying the tunable dimensionality of such materials, from 2D triangles to 1D ribbons, remains unclear. Here, we establish a general kinetic Monte Carlo model for the growth of transition metal dichalcogenides (TMDs) to address this issue. Our model reproduces several key experimental findings and reveals that the dimensionality is determined by the lattice mismatch and the interaction strength between the TMDs and the substrate. We predict that the dimensionality can be well tuned by the interaction strength and the geometry of the substrate. Our work deepens the understanding of the tunable dimensionality of low-dimensional materials and may inspire new concepts for the design of such materials with a desired dimensionality.