In order to accurately identify speech emotion information, the discriminant-cascading effect in dimensionality reduction of speech emotion recognition is investigated. Based on the existing locality preserving projections and graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method specifically utilizes supervised embedding graphs and keeps the original space for the inner products of samples to maintain enough information for speech emotion recognition. Then, the kernel DCLPP (KDCLPP) is also proposed to extend the mapping form. Validated by experiments on the EMO-DB and eNTERFACE'05 corpora, the proposed method clearly outperforms existing common dimensionality reduction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), and graph-based Fisher analysis (GbFA), with different categories of classifiers.
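The locality-preserving step that DCLPP builds on can be sketched as plain LPP: build a neighborhood graph, weight it with a heat kernel, and solve a generalized eigenproblem for the projection. A minimal NumPy sketch on toy data (this is unsupervised LPP, not the paper's supervised DCLPP; the values of `k`, `t`, and the small ridge term are illustrative assumptions):

```python
import numpy as np

def lpp(X, n_components=2, k=5, t=1.0):
    """Locality preserving projections: find a linear map that keeps
    graph-neighbouring samples close in the projected space."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]        # k nearest neighbours (skip self)
        W[i, idx] = np.exp(-d2[i, idx] / t)     # heat-kernel weights
    W = np.maximum(W, W.T)                      # symmetrise the graph
    D = np.diag(W.sum(1))
    L = D - W                                   # graph Laplacian
    # generalized eigenproblem  X^T L X a = lam X^T D X a  via B^{-1/2}
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # small ridge for stability
    d_vals, E = np.linalg.eigh(B)
    Bi = E @ np.diag(d_vals ** -0.5) @ E.T
    vals, vecs = np.linalg.eigh(Bi @ A @ Bi)
    return Bi @ vecs[:, :n_components]          # smallest eigenvalues preserve locality

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
P = lpp(X)        # projection matrix
Z = X @ P         # low-dimensional embedding
```

DCLPP replaces the unsupervised graph above with supervised embedding graphs, but the eigenproblem machinery is the same.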
Some dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed, but the acquirement of the projection matrix in these approaches considers only the between-class margin based on the SVM while ignoring the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL considers the between-class margins from SVM and LDA, and the within-class compactness from LDA, to obtain the projection matrix. As a result, DRSL realizes the combination of between-class and within-class information and fits the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments with classification techniques show the effectiveness of the proposed method.
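The within-class ingredient that DRSL adds to the SVM margin is the classical Fisher criterion: maximize between-class separation relative to within-class scatter. A two-class sketch of that criterion alone (plain LDA, not DRSL itself; the toy data and the `1e-8` ridge term are assumptions):

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher/LDA direction w = Sw^{-1}(m1 - m0),
    where Sw is the within-class scatter matrix."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(0), X1.mean(0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)  # within-class scatter
    w = np.linalg.solve(Sw + 1e-8 * np.eye(X.shape[1]), m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(50, 3)),
               rng.normal(size=(50, 3)) + [3.0, 0.0, 0.0]])
y = np.repeat([0, 1], 50)
w = fisher_direction(X, y)   # points roughly along the class-separating axis
```

DRSL combines this within-class scatter with the SVM between-class margin when forming its projection matrix.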
We present a new algorithm for manifold learning and nonlinear dimensionality reduction. Based on a set of unorganized data points sampled with noise from a parameterized manifold, the local geometry of the manifold is learned by constructing an approximation of the tangent space at each point, and those tangent spaces are then aligned to give the global coordinates of the data points with respect to the underlying manifold. We also present an error analysis of our algorithm showing that reconstruction errors can be quite small in some cases. We illustrate our algorithm using curves and surfaces both in 2D/3D Euclidean spaces and in higher-dimensional Euclidean spaces. We also address several theoretical and algorithmic issues for further research and improvements.
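The first step described above, approximating the tangent space at each point, can be sketched as a local PCA of each point's neighborhood (the subsequent global alignment step is omitted here; the choice of `k`, the noise level, and the circle-arc toy data are assumptions):

```python
import numpy as np

def local_tangent_bases(X, k=8, d=1):
    """Estimate a d-dimensional tangent basis at every sample by taking
    the top-d principal directions of its k nearest neighbours."""
    bases = []
    for i in range(X.shape[0]):
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = X[np.argsort(dist)[:k]]              # k nearest, including the point
        centered = nbrs - nbrs.mean(0)
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        bases.append(Vt[:d])                        # top-d right singular vectors
    return np.array(bases)

# noisy samples from a 1-D curve (a half-circle) embedded in 2-D
t = np.linspace(0, np.pi, 60)
rng = np.random.default_rng(1)
X = np.c_[np.cos(t), np.sin(t)] + 0.01 * rng.normal(size=(60, 2))
B = local_tangent_bases(X, k=8, d=1)
```

For a circle, the estimated tangent at a point should be nearly perpendicular to the radial direction, which is an easy sanity check on the local fit.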
Driven by the needs of real applications such as text categorization and image classification, multi-label learning has gradually become a hot research topic in recent years. Much attention has been paid to the research of multi-label classification algorithms. Considering the fact that the high dimensionality of multi-label datasets may cause the curse of dimensionality and will hamper the classification process, a dimensionality reduction algorithm, named multi-label kernel discriminant analysis (MLKDA), is proposed to reduce the dimensionality of multi-label datasets. MLKDA, with the kernel trick, processes the multi-label data integrally and realizes nonlinear dimensionality reduction with an idea similar to linear discriminant analysis (LDA). In the classification of multi-label data, the extreme learning machine (ELM) is an efficient algorithm with good accuracy. MLKDA, combined with ELM, shows good performance in multi-label learning experiments with several datasets. The experiments on both static data and data streams show that MLKDA outperforms multi-label dimensionality reduction via dependence maximization (MDDM) and multi-label linear discriminant analysis (MLDA) in cases of balanced datasets and stronger correlation between tags, and that ELM is also a good choice for multi-label classification.
Dimensionality reduction and data visualization are useful and important processes in pattern recognition. Many techniques have been developed in recent years. The self-organizing map (SOM) can be an efficient method for this purpose. This paper reviews recent advances in this area and related approaches such as multidimensional scaling (MDS), nonlinear PCA, and principal manifolds, as well as the connections of the SOM and its recent variant, the visualization induced SOM (ViSOM), with these approaches. The SOM is shown to produce a quantized, qualitative scaling, while the ViSOM produces a quantitative or metric scaling and approximates principal curves/surfaces. The SOM can also be regarded as a generalized MDS that relates two metric spaces by forming a topological mapping between them. The relationships among various recently proposed techniques such as ViSOM, Isomap, LLE, and eigenmap are discussed and compared.
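The metric scaling the review contrasts with the SOM's quantized scaling is classical MDS: double-centre the squared distance matrix to recover a Gram matrix, then embed with its top eigenvectors. A minimal sketch (the three-point toy configuration is an assumption):

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (metric) MDS from a Euclidean distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centring matrix
    B = -0.5 * J @ (D ** 2) @ J            # Gram matrix of the centred points
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]     # top-k eigenpairs
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y = classical_mds(D, 2)   # recovers the configuration up to rotation/reflection
```

For genuinely Euclidean distances, the embedding reproduces the input distance matrix exactly, which is the "metric" property discussed in the review.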
Arc sound is well known as a potential and available resource for monitoring and controlling the weld penetration status, which is very important to welding process quality control, so much attention has been paid to the relationships between arc sound and welding parameters. Some non-linear mapping models correlating the arc sound to welding parameters have been established with the help of neural networks. However, the research on utilizing arc sound to monitor and diagnose the welding process is still in its infancy. A self-made real-time sensing system is applied to study arc sound under typical penetration statuses, including partial penetration, unstable penetration, full penetration and excessive penetration, in metal inert-gas (MIG) flat tailored welding with spray transfer. Arc sound is pretreated by using wavelet de-noising and short-time windowing technologies, and its characteristics, characterizing the weld penetration status, in the time domain, frequency domain, cepstrum domain and geometric domain are extracted. Subsequently, a high-dimensional eigenvector is constructed and feature-level parameters are successfully fused utilizing the concept of primary principal component analysis (PCA). Ultimately, the 60-dimensional eigenvector is replaced by a synthesized 8-dimensional vector, which achieves compression of the feature space and provides technical support for pattern classification of typical penetration statuses with the help of arc sound in MIG welding in the future.
The high dimensions of hyperspectral imagery cause a burden for further processing. A new Fast Independent Component Analysis (FastICA) approach to dimensionality reduction for hyperspectral imagery is presented. The virtual dimensionality is introduced to determine the number of dimensions to be preserved. Since there is no prioritization among independent components generated by FastICA, the mixing matrix of FastICA is initialized by endmembers, which are extracted using the unsupervised maximum distance method. Minimum Noise Fraction (MNF) is used for preprocessing of the original data, which reduces the computational complexity of FastICA significantly. Finally, FastICA is performed on the selected principal components acquired by MNF to generate the expected independent components in accordance with the order of endmembers. Experimental results demonstrate that the proposed method outperforms second-order statistics-based transforms such as principal component analysis.
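The core FastICA iteration underlying this pipeline is a fixed-point update with a tanh nonlinearity on whitened data. A minimal deflation-scheme sketch (the MNF preprocessing, virtual dimensionality, and endmember-based initialization described in the abstract are omitted, and the two-source toy mixture is an assumption):

```python
import numpy as np

def fastica(X, n_components, iters=200, seed=0):
    """Minimal FastICA: centre, whiten, then fixed-point iterations
    w <- E[z tanh(w'z)] - E[1 - tanh^2(w'z)] w, with deflation."""
    Xc = X - X.mean(1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    d, E = np.linalg.eigh(cov)
    white = E @ np.diag(d ** -0.5) @ E.T       # whitening matrix
    Z = white @ Xc
    rng = np.random.default_rng(seed)
    W = np.zeros((n_components, Z.shape[0]))
    for i in range(n_components):
        w = rng.normal(size=Z.shape[0])
        w /= np.linalg.norm(w)
        for _ in range(iters):
            wx = w @ Z
            w_new = (Z * np.tanh(wx)).mean(1) - (1 - np.tanh(wx) ** 2).mean() * w
            w_new -= W[:i].T @ (W[:i] @ w_new)  # deflation: stay orthogonal
            w_new /= np.linalg.norm(w_new)
            w = w_new
        W[i] = w
    return W @ Z                                # estimated sources

t = np.linspace(0, 8 * np.pi, 1000)
s1 = np.sign(np.sin(t))                              # non-Gaussian source
s2 = np.random.default_rng(1).uniform(-1, 1, 1000)   # non-Gaussian source
X = np.array([[1.0, 0.5], [0.5, 1.0]]) @ np.vstack([s1, s2])
S = fastica(X, 2)
```

Because the unmixing rows are kept orthonormal in the whitened space, the recovered components are uncorrelated by construction.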
The framework of a text classification system was presented, and the high dimensionality of the feature space for text classification was studied. Mutual information is a widely used information-theoretic measure of the stochastic dependency of discrete random variables. This measure was used as a criterion to reduce the high dimensionality of feature vectors in text classification on the Web. Feature selection or conversion was performed by maximizing mutual information, including linear and non-linear feature conversions. Entropy was used and extended to select appropriate features in pattern recognition systems. A favorable foundation is thus established for text classification mining.
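The criterion itself, mutual information between a discrete feature and the class label, is straightforward to compute from empirical frequencies. A toy sketch (base-2 logarithm, so the result is in bits; the example vectors are assumptions):

```python
import numpy as np

def mutual_information(x, y):
    """I(X;Y) for two discrete variables, from empirical joint frequencies."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))          # joint probability
            px, py = np.mean(x == a), np.mean(y == b)   # marginals
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

x = np.array([0, 0, 1, 1])
print(mutual_information(x, x))                        # fully dependent: 1.0 bit
print(mutual_information(x, np.array([0, 1, 0, 1])))   # independent: 0.0 bits
```

Ranking features by this score against the class label is the reduction criterion the abstract describes.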
This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). The modified SpC2DLDPCA and SpC2DLPPCA algorithms benefit over their non-subpattern versions and the Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA) method in the following four points: (1) SpC2DLDPCA and SpC2DLPPCA avoid the heavy time cost of computing the eigenvalues and eigenvectors of large matrices. (2) SpC2DLDPCA and SpC2DLPPCA can extract local information for recognition. (3) The idea of subblocks is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA); SpC2DLDPCA combines discriminant analysis with a compression technique with low energy loss. (4) The idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
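The base technique these subpattern variants extend, 2DPCA, forms an image covariance matrix directly from 2-D images (no vectorization) and projects each image onto its leading eigenvectors. A minimal sketch (the toy image stack and the value of `k` are assumptions):

```python
import numpy as np

def two_d_pca(images, k):
    """2DPCA: image covariance G = E[(A - mean)^T (A - mean)],
    then project each image A onto the top-k eigenvectors of G."""
    mean = images.mean(0)
    G = sum((A - mean).T @ (A - mean) for A in images) / len(images)
    vals, vecs = np.linalg.eigh(G)
    P = vecs[:, ::-1][:, :k]                 # top-k eigenvectors
    return np.array([A @ P for A in images])

rng = np.random.default_rng(0)
images = rng.normal(size=(10, 8, 6))   # ten 8x6 toy "images"
F = two_d_pca(images, 2)               # each image becomes an 8x2 feature matrix
```

The subpattern variants in the paper apply this kind of analysis to subblocks of each image, which keeps the eigen-decompositions small.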
Hyperspectral image (HSI) data contain a wealth of spectral information, which makes fine classification of ground objects possible. Meanwhile, overly redundant information in HSI brings many challenges; specifically, the lack of training samples and the high computational cost are inevitable obstacles in the design of classifiers. In order to solve these problems, dimensionality reduction is usually adopted. Recently, graph-based dimensionality reduction has become a hot topic. In this paper, graph-based methods for HSI dimensionality reduction are summarized from the following aspects. 1) Traditional graph-based methods employ Euclidean distance to explore the local information of samples in the spectral feature space. 2) Dimensionality-reduction methods based on sparse or collaborative representation regard the sparse or collaborative coefficients as graph weights to effectively reduce reconstruction errors and represent the most important information of the HSI in a dictionary. 3) Improved methods based on sparse or collaborative graphs have made great progress by considering global low-rank information, local intra-class information and spatial information. In order to compare typical techniques, three real HSI datasets were used to carry out relevant experiments, and the experimental results were analysed and discussed. Finally, the future development of this research field is prospected.
A micro-electromechanical system (MEMS) scanning mirror accelerates the raster scanning of optical-resolution photoacoustic microscopy (OR-PAM). However, the nonlinear tilt angle-voltage characteristic of a MEMS mirror introduces distortion into the maximum back-projection image. Moreover, the size of the Airy disk, ultrasonic sensor properties, and thermal effects decrease the resolution. Thus, in this study, we proposed a spatial weight matrix (SWM) with dimensionality reduction for image reconstruction. The three-layer SWM contains the invariable information of the system, which includes a spatially dependent distortion correction and 3D deconvolution. We employed an ordinal-valued Markov random field and the Harris-Stephens algorithm, as well as a modified delay-and-sum method during time reversal. The results from the experiments and a quantitative analysis demonstrate that images can be effectively reconstructed using an SWM; this is also true for severely distorted images. The index of the mutual information between the reference images and registered images was, on average, 70.33 times higher than the initial index. Moreover, the peak signal-to-noise ratio was increased by 17.08% after 3D deconvolution. This accomplishment offers a practical approach to image reconstruction and a promising method to achieve real-time distortion correction for MEMS-based OR-PAM.
Dimension reduction is defined as the process of projecting high-dimensional data onto a much lower-dimensional space. Dimension reduction methods are variously applied in regression, classification, feature analysis and visualization. In this paper, we review in detail the latest methods, extensively developed in the past decade.
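The simplest instance of the definition above is PCA: project centred data onto the directions of maximal variance. A minimal sketch (the anisotropic toy data and the value of `k` are assumptions):

```python
import numpy as np

def pca_project(X, k):
    """Project X onto its top-k principal directions via SVD."""
    Xc = X - X.mean(0)                                # centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                              # scores in the reduced space

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([3.0, 1.0, 0.5, 0.2, 0.1])
Z = pca_project(X, 2)
```

By construction the first retained component carries at least as much variance as the second, which is the ordering property most reviewed methods generalize.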
Psychometric theory requires unidimensionality (i.e., scale items should represent a common latent variable). One advocated approach to test unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures based on the two item sets, compare the two estimates on a person-by-person basis using t-tests, and determine the number of cases that differ significantly at the 0.05 level; if ≤5% of tests are significant, or the lower bound of a binomial 95% confidence interval (CI) of the observed proportion overlaps 5%, then it is suggested that strict unidimensionality can be inferred; otherwise the scale is multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores the impact of sample size and of the method of estimating the 95% binomial CI upon conclusions according to recommended conventions. Normal approximation, “exact”, Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08 and 0.10 and sample sizes from n = 100 to n = 2500. Lower 95% CI boundaries were inspected regarding coverage of the 5% threshold. Results showed that all binomial 95% CIs included as well as excluded 5% as an effect of sample size for all three investigated proportions, except for the Wilson, Agresti-Coull, and Jeffreys CIs, which did not include 5% for any sample size with a 10% observed proportion. The normal approximation CI was most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted as any hypothesis testing procedure and is dependent on sample size as well as the binomial CI estimation procedure. The PCA/t-test protocol should not be viewed as a “definite” test of unidimensionality and does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context and purpose of measurement.
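The sample-size effect on the lower CI bound is easy to reproduce for one of the compared intervals, the Wilson CI. A minimal sketch (z = 1.96 for a 95% interval; the specific n values mirror the study's range):

```python
import math

def wilson_lower(p_hat, n, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion."""
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - half) / denom

# with 6% significant t-tests, whether the lower bound covers the 5%
# threshold flips with sample size, as the paper reports:
print(wilson_lower(0.06, 100))    # below 0.05: unidimensionality not rejected
print(wilson_lower(0.06, 2500))   # above 0.05: scale deemed multidimensional
```

The same 6% observed proportion thus leads to opposite conclusions at n = 100 and n = 2500, which is the core caution of the paper.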
This paper addresses the regularity and finite dimensionality of the global attractor for the plate equation on an unbounded domain. The existence of the attractor in the phase space was established in an earlier work of the author. It is shown that the attractor is actually a bounded set of the phase space and has finite fractal dimensionality.
Multi-label data with high dimensionality often occur, which produces large time and energy overheads when the data are directly used in classification tasks. To solve this problem, a novel algorithm called multi-label dimensionality reduction via semi-supervised discriminant analysis (MSDA) was proposed. It was expected to derive an objective discriminant function as smooth as possible on the data manifold by multi-label learning and semi-supervised learning. By virtue of the latent information provided by the graph weight matrix of sample attributes and the similarity correlation matrix of partial sample labels, MSDA readily maximizes the separability between different classes and estimates the intrinsic geometric structure in the lower manifold space by employing unlabeled data. Extensive experimental results on several real multi-label datasets show that, after dimensionality reduction using MSDA, the average classification accuracy is about 9.71% higher than that of other algorithms, and several evaluation metrics such as Hamming loss are also superior to those of other dimensionality reduction methods.
Low-dimensional materials have excellent properties which are closely related to their dimensionality. However, the growth mechanism underlying tunable dimensionality from 2D triangles to 1D ribbons of such materials is still unrevealed. Here, we establish a general kinetic Monte Carlo model for transition metal dichalcogenide (TMD) growth to address this issue. Our model is able to reproduce several key findings in experiments, and reveals that the dimensionality is determined by the lattice mismatch and the interaction strength between the TMD and the substrate. We predict that the dimensionality can be well tuned by the interaction strength and the geometry of the substrate. Our work deepens the understanding of the tunable dimensionality of low-dimensional materials and may inspire new concepts for the design of such materials with expected dimensionality.
Big data is a vast amount of structured and unstructured data that must be dealt with on a regular basis. Dimensionality reduction is the process of converting a data set of huge dimensions into one with fewer dimensions such that the same information is expressed concisely. Such techniques are frequently utilized to improve classification or regression tasks in machine learning. To achieve dimensionality reduction for huge data sets, this paper offers a hybrid particle swarm optimization-rough set (PSO-RS) and a Mayfly algorithm-rough set (MA-RS) approach. In particular, a novel hybrid strategy based on the Mayfly algorithm (MA) and rough sets (RS) is proposed. The performance of the novel hybrid algorithm MA-RS is evaluated by solving six different data sets from the literature. The simulation results and a comparison with common reduction methods demonstrate the proposed MA-RS algorithm's capacity to handle a wide range of data sets. Finally, the rough set approach, as well as the hybrid optimization techniques PSO-RS and MA-RS, were applied to the massive data problem. The hybrid MA-RS method beats other classic dimensionality reduction techniques, according to the experimental results and statistical testing studies.
Computer gaming is one of the most common activities that individuals engage in for interactive system-based entertainment. Visuospatial processing is an essential aspect of mental rotation (MR) in playing computer games. Previous studies have explored how objects' features affect the MR process; however, non-isomorphic 2D and 3D objects lack a fair comparison. In addition, the effects of these features on brain activation during MR in computer games have been less investigated. This study investigates how dimensionality and angular disparity affect brain activation during MR in computer games. EEG (electroencephalogram) data were recorded from sixty healthy adults while playing an MR-based computer game. Isomorphic 2D and 3D visual objects with convex and reflex angular disparities were presented in the game. Cluster-based permutation tests were applied to EEG spectral power in the 3.5–30 Hz frequency range to identify significant spatio-spectral changes. Band-specific hemispheric lateralization was also evaluated to investigate task-specific asymmetry. The results indicated higher alpha desynchronization in the left hemisphere during MR compared to baseline. The fronto-parietal areas showed neural activations during the game with convex angular disparities and 3D objects, for frequency ranges of 7.8–14.2 Hz and 7.8–10.5 Hz, respectively. These areas also showed activations during the game with reflex angular disparities and 2D objects, but for narrower frequency bands, i.e., 8.0–10.0 Hz and 11.0–11.7 Hz, respectively. Left hemispheric dominance was observed for alpha and beta frequencies. However, the right parietal region was notably more dominant for convex angular disparity and 3D objects. Overall, the results showed higher neural activities elicited by convex angular disparities and 3D objects in the game compared to reflex angles and 2D objects. The findings suggest future applications, such as cognitive modeling and controlled MR training using computer games.
The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double-weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Purpose: Exploring a dimensionality reduction model that can adeptly eliminate outliers and select the appropriate number of clusters is of profound theoretical and practical importance. Additionally, the interpretability of these models presents a persistent challenge. Design/methodology/approach: This paper proposes two innovative dimensionality reduction models based on integer programming (DRMBIP). These models assess compactness through the correlation of each indicator with its class center, while separation is evaluated by the correlation between different class centers. In contrast to DRMBIP-p, DRMBIP-v treats the threshold parameter as a variable, aiming to optimally balance compactness and separation. Findings: This study, using data from the Global Health Observatory (GHO), investigates 141 indicators that influence life expectancy. The findings reveal that DRMBIP-p effectively reduces the dimensionality of the data, ensuring compactness, and maintains compatibility with other models. Additionally, DRMBIP-v finds the optimal result, showing exceptional separation. Visualization of the results reveals that all classes have high compactness. Research limitations: DRMBIP-p requires the input of a correlation threshold parameter, which plays a pivotal role in the effectiveness of the final dimensionality reduction results. In DRMBIP-v, changing the threshold parameter into a variable potentially emphasizes either separation or compactness, which necessitates an artificial adjustment to the overflow component within the objective function. Practical implications: The DRMBIP presented in this paper is adept at uncovering the primary geometric structures within high-dimensional indicators. Validated on life expectancy data, this paper demonstrates the potential to assist data miners with the reduction of data dimensions. Originality/value: To our knowledge, this is the first time that integer programming has been used to build a dimensionality reduction model with indicator filtering. It not only has applications in life expectancy, but also has obvious advantages in data mining work that requires precise class centers.
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266), the Ph.D. Program Foundation of the Ministry of Education of China (No. 20110092130004), and the China Postdoctoral Science Foundation (No. 2015M571637).
Funding: Supported by the National Natural Science Foundation of China (No. 51105052, 61173163) and the Liaoning Provincial Natural Science Foundation of China (No. 201102037).
Abstract: Dimensionality reduction and data visualization are useful and important processes in pattern recognition, and many techniques have been developed in recent years. The self-organizing map (SOM) can be an efficient method for this purpose. This paper reviews recent advances in this area and related approaches such as multidimensional scaling (MDS), nonlinear PCA, and principal manifolds, as well as the connections of the SOM, and its recent variant the visualization induced SOM (ViSOM), with these approaches. The SOM is shown to produce a quantized, qualitative scaling, while the ViSOM produces a quantitative or metric scaling and approximates a principal curve/surface. The SOM can also be regarded as a generalized MDS that relates two metric spaces by forming a topological mapping between them. The relationships among various recently proposed techniques, such as ViSOM, Isomap, LLE, and eigenmap, are discussed and compared.
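A minimal 1-D SOM makes the "quantized scaling" idea concrete: a chain of codebook units is pulled toward the data, with a neighborhood that shrinks over time. The data, map size, and schedules below are illustrative assumptions, not from the review:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.uniform(0, 1, (500, 2))            # 2-D data on the unit square

# A tiny 1-D self-organizing map with 10 units.
n_units = 10
codebook = rng.uniform(0, 1, (n_units, 2))
idx = np.arange(n_units)

for epoch in range(20):
    lr = 0.5 * (1 - epoch / 20)               # decaying learning rate
    sigma = max(3.0 * (1 - epoch / 20), 0.5)  # shrinking neighborhood width
    for x in data:
        winner = np.argmin(np.linalg.norm(codebook - x, axis=1))
        h = np.exp(-((idx - winner) ** 2) / (2 * sigma ** 2))
        codebook += lr * h[:, None] * (x - codebook)

# Quantization error: mean distance from each sample to its nearest unit.
qe = np.mean([np.min(np.linalg.norm(codebook - x, axis=1)) for x in data])
print(round(qe, 3))
```

The trained chain threads through the square, which is the SOM's topology-preserving quantization of a higher-dimensional space.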
Funding: Supported by the Harbin Academic Pacesetter Foundation of China (Grant No. RC2012XK006002), the Zhejiang Provincial Natural Science Foundation of China (Grant No. Y1110262), the Ningbo Municipal Natural Science Foundation of China (Grant No. 2011A610148), the Ningbo Municipal Major Industrial Support Project of China (Grant No. 2011B1007), and the Heilongjiang Provincial Natural Science Foundation of China (Grant No. E2007-01)
Abstract: Arc sound is well known as a potential and available resource for monitoring and controlling the weld penetration status, which is very important to welding process quality control, so much attention has been paid to the relationships between arc sound and welding parameters. Some nonlinear mapping models correlating arc sound with welding parameters have been established with the help of neural networks. However, research on utilizing arc sound to monitor and diagnose the welding process is still in its infancy. A self-made real-time sensing system is applied to study arc sound under typical penetration statuses, including partial penetration, unstable penetration, full penetration and excessive penetration, in metal inert-gas (MIG) flat tailored welding with spray transfer. Arc sound is pretreated using wavelet de-noising and short-time windowing technologies, and its time-domain, frequency-domain, cepstrum-domain and geometric-domain characteristics, which characterize the weld penetration status, are extracted. Subsequently, a high-dimensional eigenvector is constructed and feature-level parameters are successfully fused using the concept of primary principal component analysis (PCA). Ultimately, the 60-dimensional eigenvector is replaced by a synthesized 8-dimensional vector, which achieves compression of the feature space and provides technical support for future pattern classification of typical penetration statuses from arc sound in MIG welding.
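The 60-to-8 dimensional PCA compression can be sketched directly via an SVD of centered data. The synthetic "arc-sound features" below are stand-ins generated from 8 latent factors, an illustrative assumption so that 8 components capture nearly all the variance:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for fused arc-sound feature vectors: 200 samples with 60
# correlated features generated from 8 underlying latent factors.
latent = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 60))
features = latent @ mixing + rng.normal(scale=0.05, size=(200, 60))

# PCA via SVD of the centered data: keep the 8 leading components.
centered = features - features.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
compressed = centered @ Vt[:8].T          # 200 x 8 representation

explained = (S[:8] ** 2).sum() / (S ** 2).sum()
print(compressed.shape, round(explained, 3))
```

The explained-variance ratio indicates how much information the 8-dimensional replacement retains.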
Funding: Supported by the National Natural Science Foundation of China (No. 60572135)
Abstract: The high dimensionality of hyperspectral imagery burdens further processing. A new Fast Independent Component Analysis (FastICA) approach to dimensionality reduction for hyperspectral imagery is presented. The virtual dimensionality is introduced to determine the number of dimensions to be preserved. Since there is no prioritization among the independent components generated by FastICA, the mixing matrix of FastICA is initialized with endmembers, which are extracted by an unsupervised maximum distance method. Minimum Noise Fraction (MNF) is used for preprocessing the original data, which significantly reduces the computational complexity of FastICA. Finally, FastICA is performed on the selected principal components acquired by MNF to generate the expected independent components in accordance with the order of the endmembers. Experimental results demonstrate that the proposed method outperforms second-order statistics-based transforms such as principal component analysis.
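The decorrelation step that FastICA requires can be sketched as PCA whitening. Note this is only the whitening stage, not MNF (which additionally weights by an estimated noise covariance) and not the FastICA fixed-point iteration itself; the two-band mixture below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(4)

# Mixed signals standing in for hyperspectral bands (ICA assumes the
# observed bands are linear mixtures of independent sources).
sources = np.column_stack([np.sign(rng.normal(size=1000)),
                           rng.uniform(-1, 1, 1000)])
mixed = sources @ np.array([[1.0, 0.6], [0.4, 1.0]])

# PCA whitening: rotate to the covariance eigenbasis and rescale each
# direction to unit variance.
centered = mixed - mixed.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
white = centered @ eigvec / np.sqrt(eigval)

print(np.round(np.cov(white, rowvar=False), 6))
```

After whitening the covariance is the identity, so FastICA only has to search over rotations.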
Abstract: The framework of a text classification system is presented, and the high dimensionality of the feature space for text classification is studied. Mutual information is a widely used information-theoretic measure of the stochastic dependency of discrete random variables. It is used here as a criterion to reduce the high dimensionality of feature vectors for text classification on the Web. Feature selection and conversion are performed using maximum mutual information, including linear and nonlinear feature conversions. Entropy is used and extended to find suitable features in pattern recognition systems. This establishes a favorable foundation for text classification mining.
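The mutual-information criterion can be sketched for discrete features: features that track the label score high, independent ones score near zero. The binary toy features and 10% flip rate below are illustrative assumptions:

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete variables,
    estimated from empirical joint and marginal frequencies."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            px, py = np.mean(x == a), np.mean(y == b)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

rng = np.random.default_rng(5)
labels = rng.integers(0, 2, 1000)
informative = labels ^ (rng.random(1000) < 0.1)   # mostly tracks the label
noise = rng.integers(0, 2, 1000)                  # independent of the label

print(mutual_information(informative, labels) > mutual_information(noise, labels))
```

Ranking features by this score and keeping the top-k is the simplest form of the dimensionality reduction described above.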
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61201370 and 61100103) and the Independent Innovation Foundation of Shandong University (Grant No. 2012DX07)
Abstract: This paper presents two novel algorithms for feature extraction: Subpattern Complete Two Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two Dimensional Principal Component Analysis (SpC2DPCA), SpC2DLDPCA and SpC2DLPPCA benefit in four ways: (1) they avoid the large matrices whose eigenvalue and eigenvector computations are time-consuming; (2) they can extract local information for recognition; (3) the idea of subblocks is introduced into Two Dimensional Principal Component Analysis (2DPCA) and Two Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique with low energy loss; (4) the idea is also introduced into 2DPCA and Two Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA can preserve the local neighbor graph structure and yield compact feature expressions. Finally, experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracies than their non-subpattern versions and SpC2DPCA.
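The subpattern idea (benefit 1 above) can be sketched as follows: split each image into sub-blocks and run a separate, much smaller PCA per block, then concatenate the per-block features. The image size, block grid, and component counts below are illustrative assumptions, not the CASIA(B) setup:

```python
import numpy as np

rng = np.random.default_rng(10)

# 40 synthetic "images" of 16x16, split into a 2x2 grid of 8x8 sub-blocks.
images = rng.normal(size=(40, 16, 16))
block = 8

features = []
for bi in range(0, 16, block):
    for bj in range(0, 16, block):
        sub = images[:, bi:bi + block, bj:bj + block].reshape(40, -1)
        sub = sub - sub.mean(axis=0)
        # Per-block PCA: the eigenproblem is only 64-dimensional here,
        # instead of 256-dimensional for the whole image.
        _, _, vt = np.linalg.svd(sub, full_matrices=False)
        features.append(sub @ vt[:3].T)       # keep 3 components per block

combined = np.hstack(features)                # 40 x (4 blocks * 3 components)
print(combined.shape)
```

Each block's decomposition is small and local structure is preserved in the concatenated feature vector.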
Funding: Supported by the National Key Research and Development Project (No. 2020YFC1512000), the National Natural Science Foundation of China (No. 41601344), the Fundamental Research Funds for the Central Universities (Nos. 300102320107 and 201924), the General Projects of Key R&D Programs in Shaanxi Province (No. 2020GY-060), and the Xi'an Science & Technology Project (Nos. 2020KJRC0126 and 202018)
Abstract: A hyperspectral image (HSI) contains a wealth of spectral information, which makes fine classification of ground objects possible. Meanwhile, the overly redundant information in HSI brings many challenges; specifically, the lack of training samples and the high computational cost are inevitable obstacles in classifier design. To solve these problems, dimensionality reduction is usually adopted, and graph-based dimensionality reduction has recently become a hot topic. In this paper, graph-based methods for HSI dimensionality reduction are summarized from the following aspects. 1) Traditional graph-based methods employ Euclidean distance to explore the local information of samples in the spectral feature space. 2) Dimensionality-reduction methods based on sparse or collaborative representation regard the sparse or collaborative coefficients as graph weights to effectively reduce reconstruction errors and represent the most important information of the HSI in a dictionary. 3) Improved methods based on sparse or collaborative graphs have made great progress by considering global low-rank information, local intra-class information and spatial information. To compare typical techniques, three real HSI datasets were used to carry out relevant experiments, and the experimental results were analysed and discussed. Finally, the future development of this research field is discussed.
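The traditional graph construction in aspect 1) can be sketched directly: a k-nearest-neighbor affinity graph with heat-kernel weights, from which the graph Laplacian used by LPP-style embeddings is formed. The sample data, k, and kernel width below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 5))      # 50 samples standing in for HSI pixels

# Symmetric k-NN affinity graph with heat-kernel weights based on
# Euclidean distance in the (spectral) feature space.
k, t = 5, 1.0
dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.zeros((50, 50))
for i in range(50):
    nn = np.argsort(dist2[i])[1:k + 1]      # skip the point itself
    W[i, nn] = np.exp(-dist2[i, nn] / t)
W = np.maximum(W, W.T)                      # symmetrize

D = np.diag(W.sum(axis=1))
L = D - W                                   # unnormalized graph Laplacian

# Sanity checks: the Laplacian annihilates constants and is PSD.
print(np.allclose(L @ np.ones(50), 0), np.linalg.eigvalsh(L).min() >= -1e-9)
```

Graph-based embeddings are then obtained from generalized eigenvectors of this Laplacian; the sparse/collaborative variants in aspect 2) replace the heat-kernel weights with representation coefficients.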
Funding: Supported by the National Natural Science Foundation of China (Nos. 61822505, 11774101, and 61627827), the Science and Technology Planning Project of Guangdong Province (No. 2015B020233016), the China Postdoctoral Science Foundation (No. 2019M652943), the Natural Science Foundation of Guangdong Province (No. 2019A1515011399), and the Guangzhou Science and Technology Program key projects (No. 2019050001)
Abstract: A micro-electromechanical system (MEMS) scanning mirror accelerates the raster scanning of optical-resolution photoacoustic microscopy (OR-PAM). However, the nonlinear tilt angle-voltage characteristic of a MEMS mirror introduces distortion into the maximum back-projection image. Moreover, the size of the Airy disk, the ultrasonic sensor properties, and thermal effects decrease the resolution. Thus, in this study, we propose a spatial weight matrix (SWM) with dimensionality reduction for image reconstruction. The three-layer SWM contains the invariable information of the system, which includes a spatially dependent distortion correction and 3D deconvolution. We employed an ordinal-valued Markov random field and the Harris-Stephens algorithm, as well as a modified delay-and-sum method during time reversal. The results of the experiments and a quantitative analysis demonstrate that images can be effectively reconstructed using an SWM; this is also true for severely distorted images. The index of mutual information between the reference images and registered images was 70.33 times higher than the initial index, on average. Moreover, the peak signal-to-noise ratio was increased by 17.08% after 3D deconvolution. This accomplishment offers a practical approach to image reconstruction and a promising method to achieve real-time distortion correction for MEMS-based OR-PAM.
Abstract: Dimension reduction is defined as the process of projecting high-dimensional data into a much lower-dimensional space. Dimension reduction methods are applied in a variety of settings, including regression, classification, feature analysis and visualization. In this paper, we review in detail the latest versions of these methods, which have been extensively developed over the past decade.
Abstract: Psychometric theory requires unidimensionality (i.e., scale items should represent a common latent variable). One advocated approach to testing unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures based on the two item sets, compare the two estimates on a person-by-person basis using t-tests, and determine the number of cases that differ significantly at the 0.05 level; if ≤5% of tests are significant, or the lower bound of a binomial 95% confidence interval (CI) of the observed proportion overlaps 5%, then strict unidimensionality can be inferred; otherwise the scale is multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores the impact of sample size and of the method of estimating the 95% binomial CI upon conclusions according to recommended conventions. Normal approximation, “exact”, Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08 and 0.10 and sample sizes from n = 100 to n = 2500. Lower 95% CI boundaries were inspected regarding coverage of the 5% threshold. Results showed that all binomial 95% CIs both included and excluded 5% as an effect of sample size for all three investigated proportions, except for the Wilson, Agresti-Coull, and Jeffreys CIs, which did not include 5% for any sample size with a 10% observed proportion. The normal approximation CI was most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted as any hypothesis testing procedure and is dependent on sample size as well as the binomial CI estimation procedure. The PCA/t-test protocol should not be viewed as a “definite” test of unidimensionality and does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context and purpose of measurement.
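The sample-size dependence of the CI decision rule is easy to reproduce for two of the interval types named above. The sketch below checks whether the lower 95% bound covers the 5% threshold for an observed proportion of 0.08 at a few sample sizes (the choice of 0.08 is one of the paper's three proportions; the specific n values shown are illustrative):

```python
import math

def normal_lower(p, n, z=1.96):
    """Lower bound of the normal-approximation 95% CI for a proportion."""
    return p - z * math.sqrt(p * (1 - p) / n)

def wilson_lower(p, n, z=1.96):
    """Lower bound of the Wilson score 95% CI for a proportion."""
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

# Whether the CI covers the 5% threshold flips as the sample size grows.
for n in (100, 500, 2500):
    print(n, normal_lower(0.08, n) <= 0.05, wilson_lower(0.08, n) <= 0.05)
```

At n = 100 both intervals cover 5% (unidimensionality inferred), while at larger n they do not, illustrating why the conclusion depends on sample size.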
Funding: Supported by the Scientific Research Fund of Zhejiang Provincial Education Department (No. Y200804289), the Natural Science Foundation of Ningbo City (No. 2010A610102), and the K.C. Wong Magna Fund in Ningbo University
Abstract: This paper addresses the regularity and finite dimensionality of the global attractor for the plate equation on an unbounded domain. The existence of the attractor in the phase space was established in an earlier work of the author. It is shown that the attractor is actually a bounded set of the phase space and has finite fractal dimensionality.
基金Project(60425310) supported by the National Science Fund for Distinguished Young ScholarsProject(10JJ6094) supported by the Hunan Provincial Natural Foundation of China
Abstract: Multi-label data with high dimensionality often occur, producing large time and energy overheads when used directly in classification tasks. To solve this problem, a novel algorithm called multi-label dimensionality reduction via semi-supervised discriminant analysis (MSDA) is proposed. It derives an objective discriminant function that is as smooth as possible on the data manifold by combining multi-label learning and semi-supervised learning. By virtue of the latent information provided by the graph weight matrix of sample attributes and the similarity correlation matrix of partial sample labels, MSDA maximizes the separability between different classes and estimates the intrinsic geometric structure of the lower-dimensional manifold space by employing unlabeled data. Extensive experimental results on several real multi-label datasets show that after dimensionality reduction using MSDA, the average classification accuracy is about 9.71% higher than that of other algorithms, and several evaluation metrics such as Hamming loss are also superior to those of other dimensionality reduction methods.
Funding: Supported by the Ministry of Science and Technology (No. 2018YFA0208702), the National Natural Science Foundation of China (Nos. 32090044, 21973085, 21833007, and 21790350), the Anhui Initiative in Quantum Information Technologies (AHY090200), and the Fundamental Research Funds for the Central Universities (WK2340000104)
Abstract: Low-dimensional materials have excellent properties that are closely related to their dimensionality. However, the growth mechanism underlying tunable dimensionality from 2D triangles to 1D ribbons of such materials has not yet been revealed. Here, we establish a general kinetic Monte Carlo model for transition metal dichalcogenide (TMD) growth to address this issue. Our model is able to reproduce several key findings in experiments, and it reveals that the dimensionality is determined by the lattice mismatch and the interaction strength between the TMD and the substrate. We predict that the dimensionality can be well tuned by the interaction strength and the geometry of the substrate. Our work deepens the understanding of the tunable dimensionality of low-dimensional materials and may inspire new concepts for designing such materials with the expected dimensionality.
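The core of any kinetic Monte Carlo model is the event-selection step: pick one of the competing events with probability proportional to its rate, and advance time by an exponential increment. The two events and their rates below are purely illustrative, not fitted to any TMD system:

```python
import numpy as np

rng = np.random.default_rng(7)

def kmc_step(rates, rng):
    """One kinetic Monte Carlo (Gillespie) step: choose an event with
    probability proportional to its rate, return it with the time step."""
    total = rates.sum()
    event = np.searchsorted(np.cumsum(rates), rng.random() * total)
    dt = -np.log(rng.random()) / total
    return event, dt

# Two competing attachment events (e.g. edge growth vs. corner nucleation),
# with an illustrative 3:1 rate ratio.
rates = np.array([3.0, 1.0])
counts = np.zeros(2)
for _ in range(10000):
    event, _ = kmc_step(rates, rng)
    counts[event] += 1

print(np.round(counts / counts.sum(), 2))   # close to the 3:1 rate ratio
```

A full growth model would update the rate table after each event to reflect the new lattice configuration.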
Abstract: Big data is a vast amount of structured and unstructured data that must be dealt with on a regular basis. Dimensionality reduction is the process of converting a huge dataset into data with fewer dimensions so that the same information can be expressed easily. These tactics are frequently utilized to improve classification or regression tasks in machine learning. To achieve dimensionality reduction for huge datasets, this paper offers a hybrid particle swarm optimization-rough set (PSO-RS) method and a Mayfly algorithm-rough set (MA-RS) method. In particular, a novel hybrid strategy based on the Mayfly algorithm (MA) and rough sets (RS) is proposed. The performance of the novel hybrid MA-RS algorithm is evaluated by solving six different datasets from the literature. The simulation results and a comparison with common reduction methods demonstrate the proposed MA-RS algorithm's capacity to handle a wide range of datasets. Finally, the rough set approach, as well as the hybrid optimization techniques PSO-RS and MA-RS, were applied to the massive data problem. The hybrid MA-RS method beats other classic dimensionality reduction techniques, according to the experimental results and statistical testing studies.
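The rough-set criterion that reduct-search methods such as PSO-RS optimize is the dependency degree: the fraction of rows whose equivalence class under a chosen attribute subset is decision-consistent. The tiny decision table below is an illustrative toy, not one of the paper's six datasets:

```python
from collections import defaultdict

# Toy decision table: (outlook, windy) -> play.
table = [
    (("sunny", "no"), "yes"),
    (("sunny", "yes"), "no"),
    (("rain", "no"), "yes"),
    (("rain", "no"), "yes"),
    (("rain", "yes"), "no"),
    (("sunny", "no"), "no"),   # conflicts with the first row
]

def dependency(table, attrs):
    """Rough-set dependency degree of the decision on attribute subset
    `attrs`: the share of rows in decision-consistent equivalence classes."""
    classes = defaultdict(list)
    for cond, dec in table:
        key = tuple(cond[i] for i in attrs)
        classes[key].append(dec)
    positive = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return positive / len(table)

print(dependency(table, [0, 1]), dependency(table, [0]))
```

A swarm-based search (PSO or MA) would explore binary attribute masks, scoring each candidate subset by this dependency degree while penalizing subset size.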
Funding: A.T. was supported by the DST-INSPIRE Program sponsored by the Department of Science & Technology, Government of India, Fellowship ID: IF150444, URL: https://www.onlineinspire.gov.in/.
Abstract: Computer gaming is one of the most common interactive system-based entertainment activities that individuals engage in. Visuospatial processing is an essential aspect of mental rotation (MR) in playing computer games. Previous studies have explored how objects' features affect the MR process; however, non-isomorphic 2D and 3D objects lack a fair comparison. In addition, the effects of these features on brain activation during MR in computer games have been less investigated. This study investigates how dimensionality and angular disparity affect brain activation during MR in computer games. EEG (electroencephalogram) data were recorded from sixty healthy adults while playing an MR-based computer game. Isomorphic 2D and 3D visual objects with convex and reflex angular disparities were presented in the game. Cluster-based permutation tests were applied to EEG spectral power in the frequency range 3.5–30 Hz to identify significant spatio-spectral changes. Band-specific hemispheric lateralization was also evaluated to investigate task-specific asymmetry. The results indicated higher alpha desynchronization in the left hemisphere during MR compared to baseline. The fronto-parietal areas showed neural activations during the game with convex angular disparities and 3D objects, for frequency ranges of 7.8–14.2 Hz and 7.8–10.5 Hz, respectively. These areas also showed activations during the game with reflex angular disparities and 2D objects, but for narrower frequency bands, i.e., 8.0–10.0 Hz and 11.0–11.7 Hz, respectively. Left hemispheric dominance was observed for alpha and beta frequencies. However, the right parietal region was notably more dominant for convex angular disparity and 3D objects. Overall, the results showed higher neural activities elicited by convex angular disparities and 3D objects in the game compared to reflex angles and 2D objects. The findings suggest future applications, such as cognitive modeling and controlled MR training using computer games.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (Nos. 92167106 and 61833014) and the Key Research and Development Program of Zhejiang Province (No. 2022C01206)
Abstract: The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
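The actual-virtual weighting idea can be sketched with ordinary weighted least squares in place of the paper's double weighted PLS (which is not reproduced here): virtual samples are jittered copies of the few real samples and receive a lower weight in the fit. All sizes, the jitter level, and the 0.2 weight are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(8)

# Small real dataset: the data-scarce regime the abstract describes.
X = rng.normal(size=(15, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(0, 0.1, 15)

# "Virtual big data": jittered copies of the real samples.
reps = 10
Xv = np.repeat(X, reps, axis=0) + rng.normal(0, 0.05, (15 * reps, 5))
yv = np.repeat(y, reps)

# Actual-virtual weighting: real rows get weight 1, virtual rows 0.2.
Xa = np.vstack([X, Xv])
ya = np.concatenate([y, yv])
w = np.concatenate([np.ones(15), 0.2 * np.ones(15 * reps)])

# Weighted least squares via row scaling by sqrt(weight).
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(Xa * sw[:, None], ya * sw, rcond=None)
print(np.round(coef, 1))
```

The recovered coefficients stay close to the true generating weights; the jittered virtual rows act as a mild regularizer while the weighting keeps the real measurements dominant.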
Funding: Supported by the National Natural Science Foundation of China (No. 72371115) and the Natural Science Foundation of Jilin, China (No. 20230101184JC)
Abstract: Purpose: Exploring a dimensionality reduction model that can adeptly eliminate outliers and select the appropriate number of clusters is of profound theoretical and practical importance. Additionally, the interpretability of such models presents a persistent challenge. Design/methodology/approach: This paper proposes two innovative dimensionality reduction models based on integer programming (DRMBIP). These models assess compactness through the correlation of each indicator with its class center, while separation is evaluated by the correlation between different class centers. In contrast to DRMBIP-p, DRMBIP-v treats the threshold parameter as a variable, aiming to optimally balance compactness and separation. Findings: This study, using data from the Global Health Observatory (GHO), investigates 141 indicators that influence life expectancy. The findings reveal that DRMBIP-p effectively reduces the dimensionality of the data while ensuring compactness, and it remains compatible with other models. Additionally, DRMBIP-v finds the optimal result, showing exceptional separation. Visualization of the results reveals that all classes have high compactness. Research limitations: DRMBIP-p requires the input of a correlation threshold parameter, which plays a pivotal role in the effectiveness of the final dimensionality reduction results. In DRMBIP-v, changing the threshold parameter into a variable potentially emphasizes either separation or compactness, which necessitates an artificial adjustment to the overflow component within the objective function. Practical implications: The DRMBIP presented in this paper is adept at uncovering the primary geometric structures within high-dimensional indicators. Validated on life expectancy data, this paper demonstrates its potential to assist data miners in reducing data dimensions. Originality/value: To our knowledge, this is the first time that integer programming has been used to build a dimensionality reduction model with indicator filtering. It not only has applications in life expectancy, but also has obvious advantages in data mining work that requires precise class centers.
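The compactness and separation criteria described above can be computed directly once indicators are grouped around class centers. The toy data below (two classes of indicators, each a noisy copy of its class center) is an illustrative stand-in for the GHO indicators, and the integer-programming selection itself is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(9)

# Two class centers and five noisy indicator copies of each.
center_a = rng.normal(size=100)
center_b = rng.normal(size=100)
class_a = [center_a + rng.normal(0, 0.3, 100) for _ in range(5)]
class_b = [center_b + rng.normal(0, 0.3, 100) for _ in range(5)]

def compactness(indicators, center):
    """Mean correlation of each indicator with its class center."""
    return np.mean([np.corrcoef(x, center)[0, 1] for x in indicators])

def separation(c1, c2):
    """Absolute correlation between two class centers (lower is better)."""
    return abs(np.corrcoef(c1, c2)[0, 1])

print(round(compactness(class_a, center_a), 2),
      round(separation(center_a, center_b), 2))
```

A DRMBIP-style model would choose class assignments (binary variables) that maximize compactness subject to a separation constraint or threshold.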