Software defect prediction plays a very important role in software quality assurance, which aims to inspect as many potentially defect-prone software modules as possible. However, the performance of the prediction model is susceptible to the high dimensionality of datasets that contain irrelevant and redundant features. In addition, the software metrics used for defect prediction are almost entirely traditional features, in contrast to the deep semantic feature representations learned by deep learning techniques. To address these two issues, we propose two solutions in this paper: (1) We leverage a novel non-linear manifold learning method, SOINN Landmark Isomap (SL-Isomap), to extract representative features by automatically selecting a reasonable number and position of landmarks, which can reveal the complex intrinsic structure hidden in the defect data. (2) We propose a novel defect prediction model named DLDD based on hybrid deep learning techniques, which leverages a denoising autoencoder to learn true input features that are not contaminated by noise, and utilizes a deep neural network to learn abstract deep semantic features. We combine the squared error loss function of the denoising autoencoder with the cross-entropy loss function of the deep neural network and adjust a hyperparameter to achieve the best prediction performance. We compare SL-Isomap with seven state-of-the-art feature extraction methods and compare the DLDD model with six baseline models across 20 open source software projects. The experimental results verify the superiority of SL-Isomap and DLDD on four evaluation indicators.
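The abstract above states that the squared error loss of the denoising autoencoder and the cross-entropy loss of the deep neural network are combined through a hyperparameter, but it does not give the exact form. The sketch below assumes a convex combination weighted by a hypothetical parameter `lam` and a binary (defective / non-defective) label; the function names are illustrative and are not taken from the paper.

```python
import math


def squared_error(x, x_hat):
    """Squared reconstruction error between the clean input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def binary_cross_entropy(y, p, eps=1e-12):
    """Cross-entropy for one binary label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def combined_loss(x, x_hat, y, p, lam=0.5):
    """Assumed form: lam * reconstruction loss + (1 - lam) * classification loss.

    lam is a hypothetical name for the balancing hyperparameter; the paper
    may weight the two terms differently.
    """
    return lam * squared_error(x, x_hat) + (1.0 - lam) * binary_cross_entropy(y, p)
```

With `lam=1.0` only the reconstruction term remains, and with `lam=0.0` only the classification term remains, so sweeping `lam` between the two trades reconstruction fidelity against prediction accuracy.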
Recently, manifold learning algorithms for dimensionality reduction have attracted more and more interest, and various linear and nonlinear, global and local algorithms have been proposed. The key step of a manifold learning algorithm is the selection of the neighboring region. However, to our knowledge, few references propose a generally accepted algorithm for selecting the neighboring region well. In this paper, we therefore propose an adaptive neighboring selection algorithm, which is successfully applied to the LLE and ISOMAP algorithms in our tests. The algorithm can find the optimal K nearest neighbors of the data points on the manifold. Its theoretical basis is the approximated curvature of the data point on the manifold. Based on Riemannian geometry, the Jacobian matrix is a suitable mathematical concept for predicting the approximated curvature. The proposed algorithm is verified by embedding the Swiss roll from R^3 to R^2 with the LLE and ISOMAP algorithms. The simulation results show that the proposed adaptive neighboring selection algorithm is feasible and able to find the optimal value of K, which makes the residual variance relatively small and gives a better visualization of the results. In quantitative terms, the embedding quality measured by residual variance improves by 45.45% after using the proposed algorithm in LLE.
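The abstract above measures embedding quality by residual variance. A minimal sketch of that metric follows, assuming the definition commonly used with Isomap: one minus the squared Pearson correlation between the manifold (for example, graph shortest-path) distances and the Euclidean distances in the embedding. It illustrates only the evaluation metric, not the curvature-based neighbor selection, and the function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pairwise_distances(points):
    """Euclidean distances for all pairs (i, j) with i < j, in row-major order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


def residual_variance(manifold_dists, embedded_points):
    """1 - R^2 between manifold distances and distances in the embedding.

    manifold_dists must list the pairs in the same order as pairwise_distances.
    """
    r = pearson(manifold_dists, pairwise_distances(embedded_points))
    return 1.0 - r * r
```

A value near 0 means the embedding preserves the manifold distances up to a linear rescaling, so candidate values of K can be compared by the residual variance each one produces.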
To address the incremental learning problem of manifold learning algorithms, an adaptive neighborhood incremental principal component analysis (PCA) and locality preserving projection (LPP) manifold learning algorithm is presented, and its incremental learning principle is introduced. For incremental sample data, the adjacency and covariance matrices are incrementally updated from the existing samples; then the dimensionality reduction results of the incremental samples are estimated from those of the existing samples; finally, the dimensionality reduction results of the incremental and existing samples are updated by the subspace iteration method. The adaptive neighborhood incremental PCA-LPP manifold learning algorithm is applied to the processing of gearbox fault signals. Compared with batch learning, the dimensionality reduction results obtained by incremental learning have very small error. The spatial aggregation of the incremental samples is basically stable, and the fault identification rate is increased.
In this paper, we propose an Unsupervised Nonlinear Adaptive Manifold Learning method (UNAML) that considers both global and local information. In this approach, we use unlabeled training samples to learn nonlinear manifold features while considering global pairwise distances and maintaining local topological structure. Our method aims at minimizing global pairwise distance errors as well as local structural errors. To make UNAML more efficient and able to extract manifold features from new, external data, we add a feature approximation error that can be used to learn a linear extractor. In addition, we use adaptive neighbor selection to calculate the local structural errors. The kernel matrix method is used to optimize the original algorithm. Experimental comparisons with other feature extraction methods on real face data sets and object data sets show that our algorithm is more effective.
Data-driven computing in elasticity attempts to directly use experimental data on a material, without constructing an empirical model of the constitutive relation, to predict the equilibrium state of a structure subjected to a specified external load. Provided that a data set comprising stress-strain pairs of the material is available, a data-driven method using the kernel method and regularized least squares was previously developed to extract a manifold on which the points in the data set approximately lie (Kanno 2021, Jpn. J. Ind. Appl. Math.). From the perspective of physical experiments, the stress field cannot be directly measured, while displacement and force fields are measurable. In this study, we extend the previous kernel method to the situation in which pairs of displacement and force, instead of pairs of stress and strain, are available as the input data set. A new regularized least-squares problem is formulated in this problem setting, and an alternating minimization algorithm is proposed to solve it.
Purpose: Isometric feature mapping (Isomap) is a very popular manifold learning method and is widely used in dimensionality reduction and data visualization. The most time-consuming step in Isomap is computing the shortest paths between all pairs of data points on a neighbourhood graph. The classical Isomap (C-Isomap) is very slow because it uses Floyd's algorithm to compute the shortest paths. The purpose of this paper is to speed up Isomap. Design/methodology/approach: Theoretical analysis shows that the neighbourhood graph in Isomap is sparse. In this case, Dijkstra's algorithm with a Fibonacci heap (Fib-Dij) is faster than Floyd's algorithm. An improved Isomap method is therefore proposed in which Fib-Dij replaces Floyd's algorithm. Findings: Using the S-curve, the Swiss roll, the Frey face database, the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits and a face image database, the proposed method is compared with C-Isomap; its results are consistent with those of C-Isomap and it is markedly faster. Simulations also demonstrate that Fib-Dij reduces the computation time of the shortest paths from O(N^3) to O(N^2 lg N). Research limitations/implications: Due to the limitations of the computer, the data sets in this paper all contain fewer than 3,000 samples. Researchers are therefore encouraged to test the proposed algorithm on larger data sets. Originality/value: The new method based on Fib-Dij can greatly improve the speed of Isomap.
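The idea in the abstract above, running Dijkstra's algorithm from every point of a sparse neighbourhood graph instead of Floyd's algorithm, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: it uses Python's `heapq`, which is a binary heap and not a Fibonacci heap, so its asymptotic bound differs slightly from the Fib-Dij bound quoted above, and the helper names are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as {node: {neighbour: distance}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def dijkstra(graph, source):
    """Single-source shortest paths using a binary heap with lazy deletion."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_matrix(points, k):
    """All-pairs graph distances, obtained by one Dijkstra run per source point."""
    graph = knn_graph(points, k)
    n = len(points)
    rows = []
    for s in range(n):
        dist = dijkstra(graph, s)
        rows.append([dist[t] for t in range(n)])
    return rows
```

Because each point has only about k neighbours, every Dijkstra run touches O(kN) edges, which is why repeating it N times beats the cubic cost of Floyd's algorithm on such graphs. Pairs in disconnected components come back as `math.inf`.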
As modern weapons and equipment become increasingly informatized, intelligent, and networked, the topology and traffic characteristics of battlefield data networks built on tactical data links are becoming progressively more complex. In this paper, we employ a traffic matrix to model the tactical data link network. We propose a method that uses the Maximum Variance Unfolding (MVU) algorithm to perform nonlinear dimensionality reduction analysis on high-dimensional open network traffic matrix datasets. This approach offers new ideas and methods for future applications, including traffic prediction and anomaly analysis in real battlefield network environments.
Variational mode decomposition (VMD) has proved useful for extracting the fault-induced transients of rolling bearings. Multi-bandwidth mode manifold (Triple M, TM) is a variant of VMD that unites multiple fault-related modes with different bandwidths through a nonlinear manifold learning algorithm named local tangent space alignment (LTSA). The merit of the TM method is that the extracted bearing fault-induced transients contain a low level of in-band noise without optimization of the VMD parameters. However, determining the neighborhood size of LTSA is time-consuming, and the extracted fault-induced transients may be asymmetric in the up-and-down direction. This paper aims to improve the efficiency and waveform symmetry of the TM method. Specifically, the multi-bandwidth modes, consisting of fault-related modes with different bandwidths, are first obtained by repeating the recycling VMD (RVMD) method with different bandwidth balance parameters. Then, the LTSA algorithm is performed on the multi-bandwidth modes to extract their inherent manifold structure, with the natural nearest neighbor (Triple N, TN) algorithm adopted to select the neighbors of each data point efficiently and reasonably. Finally, a weight-based feature compensation strategy is designed to synthesize the low-dimensional manifold features and alleviate the asymmetry problem, resulting in a symmetric TM feature that can represent the real fault transient components. The major contribution of the improved TM method for bearing fault diagnosis is that pure fault-induced transients are extracted efficiently and are symmetric, like the real transients. One simulation analysis and two experimental applications in bearing fault diagnosis validate the enhanced performance of the improved TM method over traditional methods. This research proposes a bearing fault diagnosis method with the advantages of high efficiency, good waveform symmetry and enhanced in-band noise removal capability.
Recently, neighbor embedding based face super-resolution (SR) methods have shown the ability to produce high-quality face images. These methods are based on the assumption that the same neighborhoods are preserved in both the low-resolution (LR) training set and the high-resolution (HR) training set. However, due to the "one-to-many" mapping between an LR image and HR ones in practice, the neighborhood relationship of an LR patch in LR space is quite different from that of its HR counterpart; that is, the neighborhood relationship obtained is not true. In this paper, we explore a novel and effective re-identified K-nearest neighbor (RIKNN) method to search for the neighbors of an LR patch. Compared with other methods, our method uses the geometrical information of the LR manifold and the HR manifold simultaneously. In particular, it searches for the K-NN of an LR patch in the LR space and refines the search results by re-identifying them in the HR space, thus giving rise to accurate K-NN and improved performance. A statistical analysis of the influence of the training set size and the number of nearest neighbors is given. Experimental results on several public face databases show the superiority of our proposed scheme over state-of-the-art face hallucination approaches in terms of subjective and objective results as well as computational complexity.
In this paper, a new multiclass classification algorithm is proposed based on the idea of Locally Linear Embedding (LLE), to avoid a defect of traditional manifold learning algorithms, namely that they cannot deal with new sample points. The algorithm defines an error criterion by computing a sample's reconstruction weights using LLE. Furthermore, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm, targeting the problem of target recognition for high range resolution MilliMeter-Wave (MMW) radar. The new algorithm is applied to radar target recognition. The experimental results show that the algorithm is efficient. Compared with other classification algorithms, our method improves the recognition precision, and the result is not sensitive to input parameters.
In this paper, a manifold subspace learning algorithm based on locality preserving discriminant projection (LPDP) is used for speaker verification. LPDP can overcome the deficiencies of total variability factor analysis and locality preserving projection (LPP), and it can effectively use the speaker label information of speech data. Through optimization, LPDP maintains the inherent local manifold structure of the speech data samples of the same speaker by reducing the distance between them. At the same time, LPDP enhances the discriminability of the embedding space by expanding the distance between the speech data samples of different speakers. The proposed method is compared with LPP and total variability factor analysis on the NIST SRE 2010 telephone-telephone core condition. The experimental results indicate that the proposed LPDP overcomes the deficiencies of LPP and total variability factor analysis and further improves system performance.
Image classification is an essential task in content-based image retrieval. However, due to the semantic gap between low-level visual features and high-level semantic concepts, and the diversification of Web images, the performance of traditional classification approaches is far from users' expectations. In an attempt to reduce the semantic gap and satisfy the urgent requirements for dimensionality reduction, high-quality retrieval results, and batch-based processing, we propose a hierarchical image manifold with novel distance measures for calculation. Assuming that the images in an image set describe the same or similar object but have various scenes, we formulate two kinds of manifolds, an object manifold and a scene manifold, at different levels of semantic granularity. The object manifold is developed for object-level classification using an algorithm named extended locally linear embedding (ELLE), based on intra- and inter-object difference measures. The scene manifold is built for scene-level classification using an algorithm named locally linear submanifold extraction (LLSE), which combines linear perturbation and region growing. Experimental results show that our method is effective in improving the performance of classifying Web images.
Manifold learning has attracted considerable attention over the last decade, and exploring the geometry and topology of the manifold is its central problem. Tangent space is a fundamental tool for discovering the geometry of the manifold. In this paper, we first review canonical manifold learning techniques and then discuss two fundamental problems in tangent space learning. One is how to estimate the tangent space from random samples, and the other is how to generalize the tangent space to the ambient space. Previous studies in tangent space learning have mainly focused on how to fit the tangent space, and one has to solve a global equation to obtain the tangent spaces. Unlike these approaches, we introduce a novel method, called persistent tangent space learning (PTSL), which estimates the tangent space at each local neighborhood while ensuring that the tangent spaces vary smoothly on the manifold. A tangent space can be viewed as a point on the Grassmann manifold. Inspired by statistics on the Grassmann manifold, we use the intrinsic sample total variance to measure the variation of the estimated tangent spaces at a single point, and thus the generalization problem can be solved by estimating the intrinsic sample mean on the Grassmann manifold. We validate our methods with various experimental results on both synthetic and real data.
A paper in a preceding issue of this journal introduced Bayesian Ying-Yang (BYY) harmony learning from the perspective of problem solving, parameter learning, and model selection. In a complementary role, this paper provides further insights from another perspective: a co-dimensional matrix pair (co-dim matrix pair for short) forms a building unit, and a hierarchy of such building units sets up the BYY system. BYY harmony learning is re-examined by exploring the nature of a co-dim matrix pair, which leads to improved learning performance with refined model selection criteria and a modified mechanism that coordinates automatic model selection and sparse learning. Besides updating typical algorithms of factor analysis (FA), binary FA (BFA), binary matrix factorization (BMF), and nonnegative matrix factorization (NMF) to share such a mechanism, we are also led to (a) a new parametrization that embeds a de-noising nature into Gaussian mixture and local FA (LFA); (b) an alternative formulation of graph Laplacian based linear manifold learning; (c) a co-decomposition of data and covariance for learning regularization and data integration; and (d) a co-dim matrix pair based generalization of temporal FA and the state space model. Moreover, with the help of a co-dim matrix pair in Hadamard product, we are led to a semi-supervised formation for regression analysis and a semi-blind learning formation for temporal FA and the state space model. Furthermore, we note that these advances provide new tools for network biology studies, including learning transcriptional regulatory networks, Protein-Protein Interaction network alignment, and network integration.
As parameter-independent yet simple techniques, the energy operator (EO) and its variants have received considerable attention in the field of bearing fault feature detection. However, the performance of these improved EO techniques is constrained by the limited number of EOs; they cannot reflect the non-linearity of machinery dynamic systems, and their noise reduction suffers. As a result, the fault-related transients strengthened by these improved EO techniques are still contaminated by strong noise. To address these issues, this paper presents a novel EO fusion strategy for enhancing the bearing fault feature nonlinearly and effectively. Specifically, the proposed strategy consists of the following three steps. First, a multi-dimensional information matrix (MDIM) is constructed by applying the higher order energy operator (HOEO) to the analyzed signal iteratively. The MDIM is regarded as the fusion source of the proposed strategy and has the properties of improving the signal-to-interference ratio and suppressing noise in the low-frequency region. Second, an enhanced manifold learning algorithm is performed on the normalized MDIM to extract the intrinsic manifolds correlated with the fault-related impulses. Third, the intrinsic manifolds are weighted to recover the fault-related transients. Simulation studies and experimental verifications confirm that the proposed strategy is more effective for enhancing the bearing fault feature than existing methods, including HOEOs, the weighting HOEO fusion, the fast Kurtogram, and empirical mode decomposition.
One of the major challenges in single-cell data analysis is determining cellular developmental trajectories from single-cell data. Although substantial studies have been conducted in recent years, more effective methods are still strongly needed to infer the developmental processes accurately. This work devises a new method, named DTFLOW, for determining pseudotemporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating the pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets. We compare the efficiency of DTFLOW with published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness properties for constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
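The two ingredients named in the BKFD step above, a random walk with restart (RWR) stationary distribution per cell and a Bhattacharyya kernel between those distributions, can be illustrated with a small sketch. It is not the DTFLOW implementation: the restart probability, the fixed iteration count, the dense adjacency matrix, and all function names are assumptions made for illustration, and the pseudotime distance that DTFLOW derives from the kernel is not reproduced here.

```python
import math


def row_normalize(adjacency):
    """Turn a non-negative adjacency matrix into a row-stochastic transition matrix.

    Assumes every node has at least one neighbour (no all-zero rows).
    """
    return [[w / sum(row) for w in row] for row in adjacency]


def rwr(transition, seed, restart=0.3, iters=200):
    """Approximate the RWR stationary distribution for one seed node by iteration."""
    n = len(transition)
    p = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        # One walking step: new mass at j is the mass flowing in from every i.
        walked = [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # With probability `restart`, jump back to the seed node.
        p = [(1.0 - restart) * walked[j] + (restart if j == seed else 0.0)
             for j in range(n)]
    return p


def bhattacharyya_kernel(distributions):
    """Kernel matrix K[a][b] = sum_i sqrt(p_a[i] * p_b[i])."""
    return [[sum(math.sqrt(a * b) for a, b in zip(p, q)) for q in distributions]
            for p in distributions]
```

For example, calling `rwr` once per node of a k-nearest neighbor graph and passing the resulting list to `bhattacharyya_kernel` gives a symmetric matrix with ones on the diagonal, where larger entries indicate cells whose random walks visit similar regions of the graph.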
This study leverages a high-dimensional manifold learning design to explore the latent structure of the pandemic policymaking space, based only on bill-level characteristics of pandemic-focused bills from 1973 to 2020. Results indicate that the COVID-19 era of policymaking maps extremely closely onto prior periods of related policymaking. This suggests that there is striking uniformity over time in Congressional policymaking related to these types of large-scale crises, despite the current era of hyperpolarization, division, and ineffective governance.
The locally linear embedding (LLE) algorithm has a distinct deficiency in practical applications: it requires users to select the neighborhood parameter k, which denotes the number of nearest neighbors. This article presents a new adaptive method based on supervised LLE. A similarity measure is formed using the Fisher projection distance and is then used as a threshold to select k. Different samples adaptively produce different values of k according to the density of the data distribution. The method is applied to the classification of plant leaves. The experimental results show that the average classification rate of the new method reaches 92.4%, which is much better than the results of traditional LLE and supervised LLE.
Funding (SL-Isomap and DLDD software defect prediction): This work is supported in part by the National Science Foundation of China (Grant Nos. 61672392, 61373038) and in part by the National Key Research and Development Program of China (Grant No. 2016YFC1202204).
Funding (adaptive neighboring selection for LLE and ISOMAP): Sponsored by the National Natural Science Foundation of China (Grant Nos. 61101122 and 61071105); the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.2010090); the Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory (Grant No. ITD-U12004); and the Postdoctoral Science Research Development Foundation of Heilongjiang Province (Grant No. LBH-Q12080).
Funding (adaptive neighborhood incremental PCA-LPP): Supported by the National Natural Science Foundation of China (No. 50775219).
Funding (UNAML): Supported in part by the National Natural Science Foundation of China (Nos. 61373093, 61402310, 61672364, and 61672365) and the National Key Research and Development Program of China (No. 2018YFA0701701).
Funding (data-driven computing in elasticity with displacement-force data): Supported by a Research Grant from the Kajima Foundation, JST CREST Grant No. JPMJCR1911, Japan, and JSPS KAKENHI (Nos. 17K06633, 21K04351).
Abstract: Data-driven computing in elasticity attempts to use experimental material data directly, without constructing an empirical model of the constitutive relation, to predict the equilibrium state of a structure subjected to a specified external load. Provided that a data set comprising stress-strain pairs of the material is available, a data-driven method using the kernel method and regularized least squares was developed to extract a manifold on which the points in the data set approximately lie (Kanno 2021, Jpn. J. Ind. Appl. Math.). In physical experiments, the stress field cannot be measured directly, while the displacement and force fields are measurable. In this study, we extend the previous kernel method to the situation in which pairs of displacement and force, instead of pairs of stress and strain, are available as the input data set. A new regularized least-squares problem is formulated in this setting, and an alternating minimization algorithm is proposed to solve it.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 91220301, 61273314 and 61175064.
Abstract: Purpose: Isometric feature mapping (Isomap) is a very popular manifold learning method and is widely used in dimensionality reduction and data visualization. The most time-consuming step in Isomap is computing the shortest paths between all pairs of data points on a neighbourhood graph. The classical Isomap (C-Isomap) is very slow because it uses Floyd’s algorithm to compute the shortest paths. The purpose of this paper is to speed up Isomap. Design/methodology/approach: Theoretical analysis shows that the neighbourhood graph in Isomap is sparse. In this case, Dijkstra’s algorithm with a Fibonacci heap (Fib-Dij) is faster than Floyd’s algorithm. An improved Isomap method is therefore proposed in which Fib-Dij replaces Floyd’s algorithm. Findings: Using the S-curve, the Swiss roll, the Frey face database, the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits and a face image database, the performance of the proposed method is compared with C-Isomap, showing consistency with C-Isomap and a marked improvement in speed. Simulations also demonstrate that Fib-Dij reduces the computation time of the shortest paths from O(N^3) to O(N^2 lg N). Research limitations/implications: Owing to the limitations of the computer used, the data sets in this paper all contain fewer than 3,000 points. Researchers are therefore encouraged to test the proposed algorithm on larger data sets. Originality/value: The new method based on Fib-Dij can greatly improve the speed of Isomap.
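The shortest-path step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: Python's `heapq` is a binary heap, so it stands in for the Fibonacci heap, and the helper names `knn_graph`, `dijkstra_all_pairs` and `floyd_all_pairs` are hypothetical. The sketch only shows that running Dijkstra from every vertex of a sparse neighbourhood graph gives the same geodesic distances as Floyd's algorithm.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as {i: {j: euclidean_distance}}."""
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def dijkstra_all_pairs(adj):
    """All-pairs shortest paths by running heap-based Dijkstra from each vertex."""
    n = len(adj)
    result = []
    for source in range(n):
        dist = [math.inf] * n
        dist[source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u].items():
                nd = d + w
                if nd < dist[v]:
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        result.append(dist)
    return result


def floyd_all_pairs(adj):
    """All-pairs shortest paths by Floyd's algorithm, O(N^3) on any graph."""
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j, w in adj[i].items():
            dist[i][j] = w
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist


if __name__ == "__main__":
    # Points on a half circle: the geodesic between the two ends follows the arc.
    arc = [(math.cos(i * math.pi / 19), math.sin(i * math.pi / 19)) for i in range(20)]
    graph = knn_graph(arc, k=2)
    print(dijkstra_all_pairs(graph)[0][19], floyd_all_pairs(graph)[0][19])
```

With a binary heap, Dijkstra from all N sources costs O(N E log N), which is O(k N^2 log N) on a k-nearest-neighbour graph; the Fibonacci heap used in the paper removes the factor tied to the edge count from the logarithmic term.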
Abstract: As modern weapons and equipment undergo increasing levels of informatization, intelligence, and networking, the topology and traffic characteristics of battlefield data networks built on tactical data links are becoming progressively more complex. In this paper, we employ a traffic matrix to model the tactical data link network. We propose a method that uses the Maximum Variance Unfolding (MVU) algorithm to perform nonlinear dimensionality reduction analysis on high-dimensional open network traffic matrix datasets. This approach introduces novel ideas and methods for future applications, including traffic prediction and anomaly analysis in real battlefield network environments.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51805342, 51875376, 52007128), the Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20180842), the China Postdoctoral Science Foundation (Grant Nos. 2021M692354, 2018M640514), the Suzhou Prospective Research Program of China (Grant No. SYG201932), and the Jiangsu Provincial Natural Science Fund for Colleges and Universities of China (Grant No. 18KJB470022).
Abstract: Variational mode decomposition (VMD) has proved useful for extracting the fault-induced transients of rolling bearings. Multi-bandwidth mode manifold (Triple M, TM) is a variation of VMD that unites multiple fault-related modes with different bandwidths through a nonlinear manifold learning algorithm named local tangent space alignment (LTSA). The merit of the TM method is that the extracted bearing fault-induced transients contain a low level of in-band noise without any optimization of the VMD parameters. However, determining the neighborhood size of the LTSA is time-consuming, and the extracted fault-induced transients may be asymmetric in the up-and-down direction. This paper aims to improve the efficiency and waveform symmetry of the TM method. Specifically, the multi-bandwidth modes, consisting of the fault-related modes with different bandwidths, are first obtained by repeating the recycling VMD (RVMD) method with different bandwidth balance parameters. Then, the LTSA algorithm is applied to the multi-bandwidth modes to extract their inherent manifold structure, with the natural nearest neighbor (Triple N, TN) algorithm adopted to select the neighbors of each data point efficiently and reasonably. Finally, a weight-based feature compensation strategy is designed to synthesize the low-dimensional manifold features and alleviate the asymmetry problem, resulting in a symmetric TM feature that can represent the real fault transient components. The major contribution of the improved TM method for bearing fault diagnosis is that pure fault-induced transients are extracted efficiently and are symmetric, like the real transients. One simulation analysis and two experimental applications in bearing fault diagnosis validate the enhanced performance of the improved TM method over traditional methods.
This research proposes a bearing fault diagnosis method which has the advantages of high efficiency, good waveform symmetry and enhanced in-band noise removal capability.
Funding: Supported by the National Natural Science Foundation of China (61172173, 61303114, 61271256, 61272544, U1304615, U1404618) and the National High Technology Research and Development Program of China (863 Program) (No. 2013AA014602).
Abstract: Recently, neighbor-embedding-based face super-resolution (SR) methods have shown the ability to produce high-quality face images. These methods assume that the same neighborhoods are preserved in both the low-resolution (LR) training set and the high-resolution (HR) training set. However, because of the "one-to-many" mapping between an LR image and its HR counterparts in practice, the neighborhood relationship of an LR patch in LR space is quite different from that of its HR counterpart; that is, the neighborhood relationship obtained is not the true one. In this paper, we explore a novel and effective re-identified K-nearest neighbor (RIKNN) method to search for the neighbors of an LR patch. Compared with other methods, our method uses the geometrical information of the LR manifold and the HR manifold simultaneously. In particular, it searches for the K-NN of an LR patch in LR space and refines the search results by re-identifying them in HR space, thus yielding accurate K-NN and improved performance. A statistical analysis of the influence of the training set size and the number of nearest neighbors is given. Experimental results on several public face databases show the superiority of the proposed scheme over state-of-the-art face hallucination approaches in terms of subjective and objective results as well as computational complexity.
Funding: Supported by the National Defense Pre-Research Foundation of China (Grant No. 9140A05070107BQ0204).
Abstract: In this paper, a new multiclass classification algorithm based on the idea of Locally Linear Embedding (LLE) is proposed to overcome a defect of traditional manifold learning algorithms, which cannot deal with new sample points. The algorithm defines an error as its criterion by computing a sample's reconstruction weights using LLE. Furthermore, to address target recognition for high-range-resolution millimeter-wave (MMW) radar, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm. The new algorithm is applied to radar target recognition. The experimental results show that the algorithm is efficient. Compared with other classification algorithms, our method improves recognition precision, and the result is not sensitive to the input parameters.
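The reconstruction-weight criterion described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the decision rule, so the rule used here (assign a new sample to the class whose nearest training samples reconstruct it with the smallest error) is an assumption, as are the names `solve`, `reconstruction_error` and `classify` and the regularisation constant `reg`. The weights themselves follow the standard LLE construction: solve the local Gram system and normalise the weights to sum to one.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x


def reconstruction_error(x, neighbours, reg=1e-3):
    """Squared error of the best sum-to-one linear reconstruction of x (LLE weights)."""
    k = len(neighbours)
    diffs = [[xi - ni for xi, ni in zip(x, nb)] for nb in neighbours]
    gram = [[sum(p * q for p, q in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    eps = reg * trace / k if trace > 0 else reg  # keeps the system well conditioned
    for i in range(k):
        gram[i][i] += eps
    w = solve(gram, [1.0] * k)
    total = sum(w)
    w = [wi / total for wi in w]
    recon = [sum(w[j] * neighbours[j][d] for j in range(k)) for d in range(len(x))]
    return sum((xi - ri) ** 2 for xi, ri in zip(x, recon))


def classify(x, classes, k=3):
    """Assumed rule: pick the class whose k nearest samples reconstruct x best."""
    errors = {}
    for label, samples in classes.items():
        nearest = sorted(samples, key=lambda s: math.dist(x, s))[:k]
        errors[label] = reconstruction_error(x, nearest)
    return min(errors, key=errors.get), errors


if __name__ == "__main__":
    training = {
        "a": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],
        "b": [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0), (3.0, 5.0)],
    }
    print(classify((1.5, 0.2), training))
```

Because the rule only needs the training samples and a local linear solve, a new point can be handled without recomputing any embedding, which is the out-of-sample limitation the abstract refers to.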
Abstract: In this paper, a manifold subspace learning algorithm based on locality preserving discriminant projection (LPDP) is used for speaker verification. LPDP overcomes the deficiencies of total variability factor analysis and locality preserving projection (LPP) by making effective use of the speaker label information in the speech data. Through optimization, LPDP maintains the inherent local manifold structure of the speech samples of the same speaker by reducing the distance between them. At the same time, LPDP enhances the discriminability of the embedding space by expanding the distance between the speech samples of different speakers. The proposed method is compared with LPP and total variability factor analysis on the NIST SRE 2010 telephone-telephone core condition. The experimental results indicate that LPDP overcomes the deficiencies of both methods and further improves system performance.
Funding: Project supported by the National High-Tech R&D Program (863) of China (No. 2009AA011900), the Zhejiang Provincial Natural Science Foundation of China (No. 2011Y1110960), and the Zhejiang Provincial Nonprofit Technology and Application Research Program of China (Nos. 2011C31045 and 2012C21020).
Abstract: Image classification is an essential task in content-based image retrieval. However, because of the semantic gap between low-level visual features and high-level semantic concepts, and the diversity of Web images, the performance of traditional classification approaches is far from users' expectations. In an attempt to reduce the semantic gap and satisfy the urgent requirements for dimensionality reduction, high-quality retrieval results, and batch-based processing, we propose a hierarchical image manifold with novel distance measures for calculation. Assuming that the images in an image set describe the same or a similar object but in various scenes, we formulate two kinds of manifolds, an object manifold and a scene manifold, at different levels of semantic granularity. The object manifold is developed for object-level classification using an algorithm named extended locally linear embedding (ELLE), based on intra- and inter-object difference measures. The scene manifold is built for scene-level classification using an algorithm named locally linear submanifold extraction (LLSE), which combines linear perturbation and region growing. Experimental results show that our method is effective in improving the performance of Web image classification.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60875044).
Abstract: Manifold learning has attracted considerable attention over the last decade, with exploring the geometry and topology of the manifold as its central problem. The tangent space is a fundamental tool for discovering the geometry of the manifold. In this paper, we first review canonical manifold learning techniques and then discuss two fundamental problems in tangent space learning. One is how to estimate the tangent space from random samples, and the other is how to generalize the tangent space to the ambient space. Previous studies in tangent space learning have mainly focused on how to fit the tangent space, and they have to solve a global equation to obtain the tangent spaces. Unlike these approaches, we introduce a novel method, called persistent tangent space learning (PTSL), which estimates the tangent space at each local neighborhood while ensuring that the tangent spaces vary smoothly on the manifold. A tangent space can be viewed as a point on the Grassmann manifold. Inspired by statistics on the Grassmann manifold, we use the intrinsic sample total variance to measure the variation of the estimated tangent spaces at a single point; the generalization problem can thus be solved by estimating the intrinsic sample mean on the Grassmann manifold. We validate our methods with various experimental results on both synthetic and real data.
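The first problem named in the abstract above, estimating a tangent space from samples, can be illustrated with a minimal local-PCA sketch. This is the common baseline, not PTSL itself: it ignores the smoothness constraint and the Grassmann-manifold statistics the abstract describes. It is restricted to 2-D data so that the leading principal direction has a closed form, and the function name `local_tangent` is hypothetical.

```python
import math


def local_tangent(points, idx, k):
    """Estimate the unit tangent direction at points[idx] by local PCA (2-D only).

    The k nearest neighbours (plus the point itself) are centred, and the
    leading eigenvector of their 2 x 2 scatter matrix is returned.
    """
    p = points[idx]
    nbrs = sorted(points, key=lambda q: math.dist(p, q))[:k + 1]
    cx = sum(q[0] for q in nbrs) / len(nbrs)
    cy = sum(q[1] for q in nbrs) / len(nbrs)
    sxx = sum((q[0] - cx) ** 2 for q in nbrs)
    syy = sum((q[1] - cy) ** 2 for q in nbrs)
    sxy = sum((q[0] - cx) * (q[1] - cy) for q in nbrs)
    # Orientation of the major axis of a symmetric 2 x 2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


if __name__ == "__main__":
    # Samples on the unit circle: the true tangent is perpendicular to the radius.
    circle = [(math.cos(2 * math.pi * i / 60), math.sin(2 * math.pi * i / 60))
              for i in range(60)]
    t = local_tangent(circle, 7, k=4)
    print(t, t[0] * circle[7][0] + t[1] * circle[7][1])
```

In higher dimensions the same idea needs an eigendecomposition of the local covariance matrix, and the resulting subspaces at neighbouring points are estimated independently, which is the limitation that motivates enforcing smooth variation.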
Funding: Supported by the General Research Fund from the Research Grant Council of Hong Kong (Project No. CUHK4180/10E) and the National Basic Research Program of China (973 Program) (No. 2009CB825404).
Abstract: A paper in a preceding issue of this journal introduced Bayesian Ying-Yang (BYY) harmony learning from the perspective of problem solving, parameter learning, and model selection. In a complementary role, this paper provides further insights from another perspective: a co-dimensional matrix pair (co-dim matrix pair for short) forms a building unit, and a hierarchy of such building units sets up the BYY system. BYY harmony learning is re-examined by exploring the nature of a co-dim matrix pair, which leads to improved learning performance with refined model selection criteria and a modified mechanism that coordinates automatic model selection and sparse learning. Besides updating typical algorithms of factor analysis (FA), binary FA (BFA), binary matrix factorization (BMF), and nonnegative matrix factorization (NMF) to share such a mechanism, we are also led to (a) a new parametrization that embeds a de-noising nature into Gaussian mixture and local FA (LFA); (b) an alternative formulation of graph-Laplacian-based linear manifold learning; (c) a co-decomposition of data and covariance for learning regularization and data integration; and (d) a co-dim matrix pair based generalization of temporal FA and the state space model. Moreover, with the help of a co-dim matrix pair in Hadamard product, we are led to a semi-supervised formulation for regression analysis and a semi-blind learning formulation for temporal FA and the state space model. Furthermore, we show that these advances provide new tools for network biology studies, including learning transcriptional regulatory networks, protein-protein interaction network alignment, and network integration.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52172406 and 51875376), the China Postdoctoral Science Foundation (Grant Nos. 2022T150552 and 2021M702752), and the Suzhou Prospective Research Program, China (Grant No. SYG202111).
Abstract: As parameter-independent yet simple techniques, the energy operator (EO) and its variants have received considerable attention in the field of bearing fault feature detection. However, the performance of these improved EO techniques is limited by the small number of EOs they use, and they cannot reflect the nonlinearity of machinery dynamic systems, which impairs noise reduction. As a result, the fault-related transients strengthened by these improved EO techniques are still contaminated by strong noise. To address these issues, this paper presents a novel EO fusion strategy for enhancing the bearing fault feature nonlinearly and effectively. Specifically, the proposed strategy is conducted in three steps. First, a multi-dimensional information matrix (MDIM) is constructed by applying the higher order energy operator (HOEO) to the analysis signal iteratively. The MDIM is the fusion source of the proposed strategy and has the properties of improving the signal-to-interference ratio and suppressing noise in the low-frequency region. Second, an enhanced manifold learning algorithm is applied to the normalized MDIM to extract the intrinsic manifolds correlated with the fault-related impulses. Third, the intrinsic manifolds are weighted to recover the fault-related transients. Simulation studies and experimental verifications confirm that the proposed strategy is more effective for enhancing the bearing fault feature than existing methods, including HOEOs, the weighting HOEO fusion, the fast Kurtogram, and empirical mode decomposition.
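For readers unfamiliar with energy operators, a minimal sketch of the basic building block follows. It assumes that the EO in the abstract above refers to the discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]; the paper's HOEO, MDIM construction and fusion steps are not reproduced, and the function name `teager_kaiser` is hypothetical.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy operator on the interior samples of x.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], so the output is two samples shorter.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


if __name__ == "__main__":
    amplitude, omega = 2.0, 0.3
    tone = [amplitude * math.cos(omega * n + 0.5) for n in range(50)]
    energy = teager_kaiser(tone)
    # For a pure tone the operator output is constant: A^2 * sin(omega)^2.
    print(energy[0], amplitude ** 2 * math.sin(omega) ** 2)
```

The operator needs no parameters and only three adjacent samples, and its output tracks both amplitude and frequency, which is why impulsive fault transients stand out after it is applied.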
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11571368, 11931019, 11775314, and 11871238) and the Fundamental Research Funds for the Central Universities, China (Grant No. 2662019QD031).
Abstract: One of the major challenges in single-cell data analysis is determining cellular developmental trajectories from single-cell data. Although many studies have been conducted in recent years, more effective methods are still strongly needed to infer developmental processes accurately. This work devises a new method, named DTFLOW, for determining pseudotemporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating the pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets, and its efficiency is compared with that of published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness in constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
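The two ingredients of BKFD named in the abstract above can be sketched on a toy graph. This is not the DTFLOW code (see the repository linked in the abstract for that): the function names `rwr_distribution` and `bhattacharyya_kernel`, the restart probability and the iteration count are all hypothetical choices. The sketch computes a random-walk-with-restart stationary distribution for each node and then the Bhattacharyya coefficient between every pair of distributions.

```python
import math


def rwr_distribution(adj, seed, restart=0.3, iters=200):
    """Stationary distribution of a random walk with restart started at `seed`.

    adj maps each node 0..n-1 to a non-empty list of neighbours.  At every step
    the walker jumps back to `seed` with probability `restart`, otherwise it
    moves to a uniformly chosen neighbour.
    """
    n = len(adj)
    p = [0.0] * n
    p[seed] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for u in range(n):
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[seed] += restart
        p = nxt
    return p


def bhattacharyya_kernel(dists):
    """K[i][j] = sum_k sqrt(p_i[k] * p_j[k]) for a list of probability vectors."""
    n = len(dists)
    return [
        [sum(math.sqrt(a * b) for a, b in zip(dists[i], dists[j])) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 - 4 standing in for a k-nearest-neighbour graph of cells.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    distributions = [rwr_distribution(path, s) for s in path]
    kernel = bhattacharyya_kernel(distributions)
    print(kernel[0][1], kernel[0][4])
```

Nodes that are close on the graph have overlapping distributions and therefore kernel values near one, while distant nodes have values near zero; how DTFLOW turns this kernel into a pseudotime distance is described in the paper and is not reproduced here.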
Abstract: This study leverages a high-dimensional manifold learning design to explore the latent structure of the pandemic policymaking space, based only on bill-level characteristics of pandemic-focused bills from 1973 to 2020. Results indicate that the COVID-19 era of policymaking maps extremely closely onto prior periods of related policymaking. This suggests that there is striking uniformity in Congressional policymaking related to these types of large-scale crises over time, despite the current era of hyperpolarization, division, and ineffective governance.
Funding: This study was financially supported by the National Natural Science Foundation of China (61172127), the Research Fund for the Doctoral Program of Higher Education (KJQN1114), the Anhui Provincial Natural Science Foundation (1308085QC58), the 211 Project Youth Scientific Research Fund of Anhui University, and the Provincial Natural Science Foundation of Anhui Universities (KJ2013A026).
Abstract: The locally linear embedding (LLE) algorithm has a distinct deficiency in practical applications: it requires users to select the neighborhood parameter k, which denotes the number of nearest neighbors. In this article, a new adaptive method based on supervised LLE is presented. A similarity measure is formed from the Fisher projection distance and is then used as a threshold to select k. Different samples thus produce different values of k adaptively, according to the density of the data distribution. The method is applied to the classification of plant leaves. The experimental results show that the average classification rate of the new method reaches 92.4%, which is much better than the results of traditional LLE and supervised LLE.