Data volumes today are enormous because of the extensive use of the World Wide Web, social media and intelligent systems. This data can be very valuable if it is harnessed carefully and correctly. Useful information can be extracted from it through data mining, and that information can be used to make vital decisions in various industries. Clustering is a very popular data mining method that divides data points into groups such that similar data points belong to the same group. Clustering methods are of various types, and many parameters and indexes exist for evaluating and comparing them. In this paper, we compare the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM) and Clustering Large Applications (CLARA) on secure perturbed data. We identify which method performs better for analyzing data perturbed using Extended NMF, on the basis of the values of various indexes such as the Dunn Index, Silhouette Index, Xie-Beni Index and Davies-Bouldin Index.
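As a rough illustration of one of the validity indexes named above, the sketch below computes a Dunn index for a hard partition. It assumes the classical form (smallest distance between points of different clusters divided by the largest within-cluster diameter); other variants of the index exist, and the abstract does not say which one the authors used. The function name and the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn index: smallest between-cluster distance / largest cluster diameter."""
    # Largest distance between two points of the same cluster (the diameter).
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    # Smallest distance between points that belong to different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        raise ValueError("undefined when every cluster is a single point")
    return min_separation / max_diameter


# Hypothetical toy data: two tight, well-separated clusters in the plane.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
# The same layout with the second cluster moved closer to the first.
close = [[(0.0, 0.0), (0.0, 1.0)], [(2.0, 0.0), (2.0, 1.0)]]

print(dunn_index(tight))  # 5.0
print(dunn_index(close))  # 2.0
```

A larger value indicates compact, well-separated clusters, which is why the index can rank partitions produced by different clustering methods on the same data.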
Because vibration signals acquired from faulty rolling element bearings are non-stationary, time-frequency analysis is often applied to describe the local information of these unstable signals. However, it is difficult to classify the high-dimensional feature matrix directly, because its dimensions are too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) and non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, a supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and adaptive clustering of NMF, and the localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, an experiment covering nine kinds of bearing fault was performed on a test bench. The proposed method can effectively identify the fault severity and the different fault types. Moreover, NMF yields a mean accuracy of 99.3%, which is much superior to that of an artificial neural network (ANN). This research presents a simple and practical solution to the fault diagnosis problem of rolling element bearings in a high-dimensional feature space.
To address the problems of bispectral analysis when applied to machinery fault diagnosis, a machinery fault feature extraction method based on sparseness-controlled non-negative tensor factorization (SNTF) is proposed. First, a non-negative tensor factorization (NTF) algorithm is improved by imposing sparseness constraints on it. Second, the bispectral images of mechanical signals are obtained and stacked to form a third-order tensor. Third, the improved algorithm is used to extract features from this tensor, which are represented by a series of basis images. Finally, the coefficients that indicate the weights of these basis images in constituting the original bispectral images are calculated for fault classification. Experiments on fault diagnosis of gearboxes show that the extracted features not only reveal some nonlinear characteristics of the system, but also have intuitive meanings with regard to fault characteristic frequencies. These features greatly facilitate the interpretation of the relationships between machinery faults and the corresponding bispectra.
A novel framework is proposed to obtain physiologically meaningful features for Alzheimer's disease (AD) classification based on sparse functional connectivity and non-negative matrix factorization. Specifically, the non-negative adaptive sparse representation (NASR) method is applied to compute the sparse functional connectivity among brain regions based on functional magnetic resonance imaging (fMRI) data for feature extraction. Afterwards, the sparse non-negative matrix factorization (sNMF) method is adopted for dimensionality reduction to obtain low-dimensional features with straightforward physical meaning. The experimental results show that the proposed framework outperforms the competing frameworks in terms of classification accuracy, sensitivity and specificity. Furthermore, three sub-networks, including the default mode network, the basal ganglia-thalamus-limbic network and the temporal-insular network, are found to differ notably between the AD patients and the healthy subjects. The proposed framework can effectively identify AD patients and has potential for extending the understanding of the pathological changes of AD.
This paper presents a novel medical image registration algorithm named total variation constrained graph regularization for non-negative matrix factorization (TV-GNMF). The method applies non-negative matrix factorization with a total variation constraint and graph regularization. The main contributions of our work are the following. First, total variation is incorporated into NMF to control the diffusion speed. The purpose is to denoise in smooth regions and preserve features or details of the data in edge regions by using a diffusion coefficient based on gradient information. Second, we add graph regularization to NMF to reveal the intrinsic geometry and structure information of features and so enhance the discrimination power. Third, the multiplicative update rules and a proof of convergence of the TV-GNMF algorithm are given. Experiments conducted on datasets show that the proposed TV-GNMF method outperforms other state-of-the-art algorithms.
Non-negative matrix factorization (NMF) is a technique for dimensionality reduction that places non-negativity constraints on the matrix. Based on the PARAFAC model, NMF was extended to the decomposition of three-dimensional data. The three-dimensional non-negative matrix factorization (NMF3) algorithm given in this paper is concise and easy to implement. Its implementation is based on elements rather than on vectors, and it can decompose a data array directly without unfolding, unlike traditional algorithms. It was applied to the decomposition of a simulated data array and obtained reasonable results, which shows that NMF3 could be introduced for curve resolution in chemometrics.
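For orientation, the standard PARAFAC (trilinear) model with non-negativity constraints can be written as below, next to ordinary two-way NMF for comparison. The symbols (x, a, b, c, v, w, h and the number of components R) are notation chosen here for illustration; the abstract does not give the paper's own notation or update rules.

```latex
% Two-way NMF: a non-negative matrix approximated by a product of two non-negative factors
v_{ij} \approx \sum_{r=1}^{R} w_{ir}\, h_{rj}, \qquad w_{ir} \ge 0,\; h_{rj} \ge 0

% Three-way PARAFAC model with non-negativity constraints on all three factor matrices
x_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}, \qquad a_{ir} \ge 0,\; b_{jr} \ge 0,\; c_{kr} \ge 0
```

Because the three-way model is expressed directly in terms of the array elements, it can in principle be fitted without first unfolding the array into a matrix.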
Underwater direction of arrival (DOA) estimation has always been a very challenging theoretical and practical problem. Due to severe non-stationary, non-linear, and non-Gaussian characteristics, machine learning based DOA estimation methods trained on simulated array data with Gaussian noise cannot be directly applied to actual underwater DOA estimation tasks. To deal with this problem, environmental data with no target echoes can be employed to analyze the non-Gaussian components. The information obtained about the non-Gaussian components can then be used to whiten the array data. Based on these considerations, a novel practical sonar array whitening method is proposed. Specifically, based on the weak assumption that the non-Gaussian components in adjacent patches with and without target echoes are almost the same, canonical correlation analysis (CCA) and non-negative matrix factorization (NMF) techniques are employed to whiten the array data. With the whitened array data, machine learning based DOA estimation models trained on simulated datasets with Gaussian noise can be used to perform underwater DOA estimation tasks. Experimental results on actual underwater datasets, tested with known machine learning based DOA estimation models, show that the proposed whitening method achieves accurate and robust DOA estimation performance in different underwater conditions.
Many problems in image representation and classification involve some form of dimensionality reduction. Non-negative matrix factorization (NMF) is a recently proposed unsupervised procedure for learning spatially localized, parts-based subspace representations of objects. An improvement of classical NMF is presented that combines it with Log-Gabor wavelets to enhance its parts-based learning ability. The new method is compared with principal component analysis (PCA) and with locally linear embedding (LLE), proposed recently in Science. Finally, the new method is applied to several real-world datasets and achieves good performance in representation and classification.
Working memory plays an important role in human cognition. This study investigated how working memory was encoded by the power of multichannel local field potentials (LFPs) based on sparse non-negative matrix factorization (SNMF). SNMF was used to extract features from LFPs recorded from the prefrontal cortex of four Sprague-Dawley rats during a memory task in a Y maze, with 10 trials for each rat. The power-increased LFP components were then selected as working memory-related features and the other components were removed. After that, the inverse operation of SNMF was used to study the encoding of working memory in the time-frequency domain. We demonstrated that theta and gamma power increased significantly during the working memory task. The results suggested that postsynaptic activity was simulated well by the sparse activity model. The theta and gamma bands were meaningful for encoding working memory.
To address the low recognition accuracy of non-negative matrix factorization (NMF) in practical applications, an improved sparse graph NMF (New-SGNMF) is proposed in this paper. New-SGNMF makes full use of the inherent geometric structure of image data to optimize the basis matrix in two steps. First, a threshold value s was applied to the decomposed basis matrix to filter out redundant information in the data. Then, sparse constraints based on the L2 norm were imposed on the basis matrix and integrated into the objective function to obtain the objective function of New-SGNMF. In addition, the derivation and the convergence analysis of the algorithm were given. The experimental results on the COIL20, PIE-pose09 and YaleB databases show that, compared with K-means, PCA, NMF and other algorithms, the proposed algorithm achieves higher accuracy and normalized mutual information.
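Normalized mutual information, one of the two scores reported above, compares a predicted clustering with reference labels independently of how the clusters are numbered. The sketch below is one common way to compute it, assuming normalization by the arithmetic mean of the two entropies; the abstract does not state which normalization the authors used, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the two entropies."""
    denom = entropy(a) + entropy(b)
    if denom == 0.0:
        return 1.0  # both labelings place every item in a single cluster
    return 2.0 * mutual_information(a, b) / denom


truth = [0, 0, 1, 1]
print(nmi(truth, [1, 1, 0, 0]))  # same partition with the cluster names swapped
print(nmi(truth, [0, 1, 0, 1]))  # partition unrelated to the reference labels
```

The first call scores a perfect clustering despite the swapped cluster names, while the second scores a partition that carries no information about the reference labels.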
Constrained spectral non-negative matrix factorization (NMF) analysis of perturbed oscillatory process control loop variable data is performed to isolate multiple plant-wide oscillatory sources. The technique is described and demonstrated by analyzing both simulated data and real data from a chemical process plant. Results show that the proposed approach can map multiple oscillatory sources onto the most appropriate control loops, and has superior performance in terms of reconstruction accuracy and intuitive understanding compared with spectral independent component analysis (ICA).
Hierarchical topic models have been widely applied in many real applications, because they can build a hierarchy of topics while guaranteeing topic quality. Most traditional methods build a hierarchy by adopting low-level topics as new features to construct high-level ones, which often causes semantic confusion between low-level and high-level topics. To address this problem, we propose a novel topic model named hierarchical sparse NMF with orthogonal constraint (HSOC), which is based on non-negative matrix factorization and builds the topic hierarchy by splitting super-topics into sub-topics. In HSOC, we introduce global independence, local independence and information consistency to constrain the split topics. Extensive experimental results on real-world corpora show that the proposed model achieves comparable performance on topic quality and better performance on semantic feature representation of documents compared with baseline methods.
Non-negative matrix factorization (NMF) has been widely used in mixture analysis for hyperspectral remote sensing. When used for spectral unmixing analysis, however, it has two main shortcomings: (1) since the dimensionality of hyperspectral data is usually very large, NMF tends to suffer from large computational complexity under the popular multiplicative iteration rule; (2) NMF is sensitive to noise (outliers), and thus corrupted data will make the results of NMF meaningless. Although principal component analysis (PCA) can be used to mitigate these two problems, the transformed data will contain negative numbers, hindering the direct use of the multiplicative iteration rule of NMF. In this paper, we analyze the impact of PCA on NMF, and find that multiplicative NMF is also applicable to data after principal component transformation. Based on this conclusion, we present a method to perform NMF in the principal component space, named 'principal component NMF' (PCNMF). Experimental results show that PCNMF is both accurate and time-saving.
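The multiplicative iteration rule referred to above is commonly the Lee-Seung update for the squared Frobenius error. The sketch below implements that plain rule in standard-library Python so the role of non-negativity is visible: every update multiplies by a ratio of non-negative quantities, which is why negative entries in PCA-transformed data are an obstacle. This is ordinary NMF, not the PCNMF method of the abstract; the helper names, the small `eps` safeguard and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative V (m x n) into W (m x rank) and H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


# Hypothetical toy matrix with an exact non-negative rank-2 factorization.
v = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0], [2.0, 5.0, 3.0]]
w, h, errors = nmf(v, rank=2)
print(errors[0], errors[-1])  # reconstruction error before and after the updates
```

Each iteration costs several dense matrix products over the full data matrix, which is the source of the computational burden the abstract mentions for high-dimensional hyperspectral data.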
Nonnegative matrix factorization (NMF) is a powerful technique for dimension reduction and pattern recognition through single-layer data representation learning. However, deep learning networks, with their carefully designed hierarchical structure, can combine hidden features to form more representative features for pattern recognition. In this paper, we propose sparse deep NMF models to analyze complex data for more accurate classification and better feature interpretation. Such models are designed to learn localized features or generate more discriminative representations for samples in distinct classes by imposing an L1-norm penalty on the columns of certain factors. By extending a one-layer model into a multilayer model with sparsity, we provide a hierarchical way to analyze big data and intuitively extract hidden features due to nonnegativity. We adopt Nesterov's accelerated gradient algorithm to accelerate the computing process. We also analyze the computational complexity of our frameworks to demonstrate their efficiency. To improve performance on linearly inseparable data, we also consider incorporating popular nonlinear functions into these frameworks and explore their performance. We apply our models to two benchmark image datasets, and the results show that our models can achieve competitive or better classification performance and produce intuitive interpretations compared with typical NMF and competing multilayer models.
This study examined the advantages of the non-negative matrix factorization (NMF) algorithm for extracting information from aerial images. First, NMF was used for aerial image information extraction, and the results were then compared with those of principal component analysis (PCA), with r (the number of rows or columns of the basis matrix) and Eignum (the number of eigenvalues) given different values. Experimental results showed that the run time of NMF with r = 20 or 50 was less than that of PCA with Eignum = 20 or 50. Also, the recognition rate of NMF with r = 50 was higher than that of PCA with Eignum = 50. The experiment showed that non-negative matrix factorization offered a short run time together with a high recognition rate.
Funding (bearing fault diagnosis with TFD and NMF): Supported by the Shaanxi Provincial Overall Innovation Project of Science and Technology, China (Grant No. 2013KTCQ01-06).
Funding (machinery fault feature extraction with SNTF): The National Natural Science Foundation of China (No. 50875048); the Natural Science Foundation of Jiangsu Province (No. BK2007115); the National High Technology Research and Development Program of China (863 Program) (No. 2007AA04Z421).
Funding (Alzheimer's disease classification with NASR and sNMF): The Foundation of Hygiene and Health of Jiangsu Province (No. H2018042); the National Natural Science Foundation of China (No. 61773114); the Key Research and Development Plan (Industry Foresight and Common Key Technology) of Jiangsu Province (No. BE2017007-3).
Funding (medical image registration with TV-GNMF): Supported by the National Natural Science Foundation of China (61702251, 41971424, 61701191, U1605254); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Key Technical Project of Fujian Province (2017H6015); the Science and Technology Project of Xiamen (3502Z20183032); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); the Youth Academic Talent Support Program of Northwest University (360051900151); the Natural Sciences and Engineering Research Council of Canada.
Funding (sonar array whitening for underwater DOA estimation): Supported by the National Natural Science Foundation of China (No. 51279033).
Funding (working memory encoding with SNMF): Supported by the National Natural Science Foundation of China (61074131 and 91132722); the Doctoral Fund of the Ministry of Education of China (21101202110007).
Funding (New-SGNMF): This work was supported by the National Natural Science Foundation of China (Grant No. 61501005); the Anhui Natural Science Foundation (Grant No. 1608085MF147); the Natural Science Foundation of Anhui Universities (Grant No. KJ2016A057); the Industry Collaborative Innovation Fund of Anhui Polytechnic University and Jiujiang District (Grant No. 2021cyxtb4); the Science Research Project of Anhui Polytechnic University (Grant No. Xjky2020120).
Funding (constrained spectral NMF for plant-wide oscillations): Supported by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
文摘Constrained spectral non-negative matrix factorization(NMF)analysis of perturbed oscillatory process control loop variable data is performed for the isolation of multiple plant-wide oscillatory sources.The technique is described and demonstrated by analyzing data from both simulated and real plant data of a chemical process plant. Results show that the proposed approach can map multiple oscillatory sources onto the most appropriate control loops,and has superior performance in terms of reconstruction accuracy and intuitive understanding compared with spectral independent component analysis(ICA).
文摘Hierarchical topic model has been widely applied in many real applications, because it can build a hierarchy on topics with guaranteeing of topics' quality. Most of traditional methods build a hierarchy by adopting low-level topics as new features to construct high-level ones, which will often cause semantic confusion between low-level topics and high-level ones. To address the above problem, we propose a novel topic model named hierarchical sparse NMF with orthogonal constraint (HSOC), which is based on non-negative matrix factorization and builds topic hierarchy via splitting super-topics into sub-topics. In HSOC, we introduce global independence, local independence and information consistency to constraint the split topics. Extensive experimental results on real-world corpora show that the purposed model achieves comparable performance on topic quality and better performance on semantic feature representation of documents compared with baseline methods.
Abstract: Non-negative matrix factorization (NMF) has been widely used in mixture analysis for hyperspectral remote sensing. When used for spectral unmixing analysis, however, it has two main shortcomings: (1) since the dimensionality of hyperspectral data is usually very large, NMF tends to suffer from high computational complexity under the popular multiplicative iteration rule; (2) NMF is sensitive to noise (outliers), so corrupted data can make the results of NMF meaningless. Although principal component analysis (PCA) can be used to mitigate these two problems, the transformed data will contain negative numbers, hindering the direct use of the multiplicative iteration rule of NMF. In this paper, we analyze the impact of PCA on NMF, and find that multiplicative NMF can also be applied to data after principal component transformation. Based on this conclusion, we present a method to perform NMF in the principal component space, named 'principal component NMF' (PCNMF). Experimental results show that PCNMF is both accurate and time-saving.
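For readers unfamiliar with the "multiplicative iteration rule" mentioned in the abstract above, the following is a minimal sketch of the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. It is not the PCNMF method of that paper: it assumes a non-negative input matrix, and the function names (`nmf_multiplicative`, `matmul`, `transpose`, `frobenius_error`), the small constant `eps`, and the toy matrix are all illustrative choices. Plain Python lists are used only to keep the example dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf_multiplicative(v, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m).

    Illustrative sketch: eps is an assumed guard against division by zero.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random initialisation keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


if __name__ == "__main__":
    # Toy non-negative matrix of rank 2 (made up for illustration).
    V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
    W, H, errs = nmf_multiplicative(V, rank=2)
    print("initial error:", errs[0], "final error:", errs[-1])
```

Because each update multiplies a factor by a ratio of non-negative quantities, the factors stay non-negative only when the input is non-negative. This is why data containing negative values after a PCA transform cannot be fed to these updates directly, which is the obstacle the abstract describes.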
Funding: Supported by the National Natural Science Foundation of China (Nos. 11661141019 and 61621003), the National Ten Thousand Talent Program for Young Topnotch Talents, the Chinese Academy of Sciences (CAS) Frontier Science Research Key Project for Top Young Scientist (No. QYZDB-SSW-SYS008), and the Key Laboratory of Random Complex Structures and Data Science, CAS (No. 2008DP173182).
Abstract: Nonnegative Matrix Factorization (NMF) is a powerful technique for dimension reduction and pattern recognition through single-layer data representation learning. However, deep learning networks, with their carefully designed hierarchical structure, can combine hidden features to form more representative features for pattern recognition. In this paper, we propose sparse deep NMF models to analyze complex data for more accurate classification and better feature interpretation. Such models are designed to learn localized features or generate more discriminative representations for samples in distinct classes by imposing an L1-norm penalty on the columns of certain factors. By extending a one-layer model into a multilayer model with sparsity, we provide a hierarchical way to analyze big data and, thanks to nonnegativity, to extract hidden features intuitively. We adopt Nesterov's accelerated gradient algorithm to speed up the computation. We also analyze the computational complexity of our frameworks to demonstrate their efficiency. To improve performance on linearly inseparable data, we also incorporate popular nonlinear functions into these frameworks and explore their performance. We applied our models to two benchmark image datasets, and the results show that our models achieve competitive or better classification performance and produce intuitive interpretations compared with typical NMF and competing multilayer models.
Abstract: This study examines the advantages of the non-negative matrix factorization (NMF) algorithm for extracting information from aerial images. First, NMF was used for aerial image information extraction, and the results were then compared with principal component analysis (PCA), with r (the number of rows or columns of the basis matrix) and Eignum (the number of eigenvalues) given different values. Experimental results showed that the run time of NMF with r = 20 or 50 was less than that of PCA with Eignum = 20 or 50. Also, the recognition rate of NMF with r = 50 was higher than that of PCA with Eignum = 50. The experiment showed that NMF offers a short run time together with a high recognition rate.