Abstract: Data volumes today are enormous because of the extensive use of the World Wide Web, social media, and intelligent systems. This data can be very important and useful if it is harnessed carefully and correctly. Useful information can be extracted from this massive data through the data mining process, and the extracted information can be used to make vital decisions in various industries. Clustering is a very popular data mining method that divides data points into groups such that similar data points fall in the same group. Clustering methods are of various types, and many parameters and indexes exist for evaluating and comparing them. In this paper, we compare the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM), and Clustering Large Applications (CLARA) on secure perturbed data. We identify the method that performs better for analyzing data perturbed using Extended NMF, on the basis of the values of several indexes: the Dunn Index, Silhouette Index, Xie-Beni Index, and Davies-Bouldin Index.
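As an illustration of how one of the indexes named above scores a partition, the following minimal sketch computes the Dunn Index in its common textbook form (smallest distance between points of different clusters divided by the largest cluster diameter). The function name `dunn_index` and the toy data are assumptions made for this example; the abstract does not say which variant of the index or which implementation the authors used.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Dunn index: smallest between-cluster distance / largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two points that share a cluster (the diameter).
    max_diameter = max(
        (math.dist(a, b) for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points that lie in different clusters.
    min_separation = min(
        math.dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0.0:
        return math.inf  # every cluster is a single point
    return min_separation / max_diameter


# Hypothetical toy data: two tight groups far apart on the x-axis.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # labels follow the two groups
bad = dunn_index(points, [0, 1, 0, 1])   # labels cut across the two groups
```

A higher value indicates more compact and better separated clusters, so the labeling that follows the two groups scores higher than the one that cuts across them.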
Funding: Supported by the National Natural Science Foundation of China (No. 51279033).
Abstract: Underwater direction of arrival (DOA) estimation has always been a very challenging theoretical and practical problem. Because underwater data are strongly non-stationary, non-linear, and non-Gaussian, machine learning based DOA estimation methods trained on simulated array data with Gaussian noise cannot be directly applied to actual underwater DOA estimation tasks. To deal with this problem, environmental data with no target echoes can be employed to analyze the non-Gaussian components, and the information obtained about these components can then be used to whiten the array data. Based on these considerations, a novel practical sonar array whitening method is proposed. Specifically, under the weak assumption that the non-Gaussian components in adjacent patches with and without target echoes are almost the same, canonical correlation analysis (CCA) and non-negative matrix factorization (NMF) are employed to whiten the array data. With the whitened array data, machine learning based DOA estimation models trained on simulated datasets with Gaussian noise can be used to perform underwater DOA estimation tasks. Experimental results on actual underwater datasets, tested with known machine learning based DOA estimation models, show that the proposed whitening method achieves accurate and robust DOA estimation performance under different underwater conditions.
Funding: Supported by the National Natural Science Foundation of China (61702251, 41971424, 61701191, U1605254); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Key Technical Project of Fujian Province (2017H6015); the Science and Technology Project of Xiamen (3502Z20183032); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); the Youth Academic Talent Support Program of Northwest University (360051900151); and the Natural Sciences and Engineering Research Council of Canada.
Abstract: This paper presents a novel medical image registration algorithm named total variation constrained graph regularization for non-negative matrix factorization (TV-GNMF). The method combines non-negative matrix factorization with a total variation constraint and graph regularization. The main contributions of our work are the following. First, total variation is incorporated into NMF to control the diffusion speed. The purpose is to denoise in smooth regions and preserve features or details of the data in edge regions by using a diffusion coefficient based on gradient information. Second, we add graph regularization to NMF to reveal the intrinsic geometry and structure information of features and so enhance the discrimination power. Third, the multiplicative update rules and a proof of convergence of the TV-GNMF algorithm are given. Experiments conducted on datasets show that the proposed TV-GNMF method outperforms other state-of-the-art algorithms.
Funding: Supported by the National Natural Science Foundation of China (No. 60872083) and the National High Technology Research and Development Program of China (No. 2007AA12Z149).
Abstract: This paper considers the problem of unsupervised spectral unmixing of hyperspectral data. Based on the Linear Mixing Model (LMM), a new method under the framework of nonnegative matrix factorization (NMF) is proposed, namely minimum distance constrained nonnegative matrix factorization (MDC-NMF). First, a new regularization term called endmember distance (ED) is considered, which is defined as the sum of the squared Euclidean distances from each endmember to their geometric center. Compared with the simplex volume, ED has better optimization properties and is conceptually intuitive. Second, a projected gradient (PG) scheme is adopted, and by virtue of ED the optimal step size along the feasible descent direction can be calculated easily at each iteration. Third, an algorithm that terminates in a finite number of steps (no more than the number of endmembers) is used to project a point onto the canonical simplex, so that the abundance nonnegativity constraint and the abundance sum-to-one constraint are satisfied accurately with little computation. The experimental results, based on a set of synthetic data and real data, demonstrate that, in the same running time, MDC-NMF outperforms several other similar methods proposed recently.
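The two ingredients described above can be sketched in a few lines. The first function evaluates the endmember distance exactly as the abstract defines it. The second projects a vector onto the canonical simplex with a standard active-set iteration (in the style of Michelot's method), which drops at least one coordinate per pass and therefore stops after at most as many passes as there are coordinates. Whether this is the specific projection algorithm used in the paper is an assumption, and the function names and sample values are hypothetical.

```python
def endmember_distance(endmembers):
    """Sum of squared Euclidean distances from each endmember to their centroid."""
    count = len(endmembers)
    dims = len(endmembers[0])
    center = [sum(e[d] for e in endmembers) / count for d in range(dims)]
    return sum((e[d] - center[d]) ** 2 for e in endmembers for d in range(dims))


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    if not v:
        raise ValueError("cannot project an empty vector")
    active = list(range(len(v)))
    while True:
        # Shift that makes the currently active coordinates sum to one.
        shift = (sum(v[i] for i in active) - 1.0) / len(active)
        kept = [i for i in active if v[i] - shift > 0.0]
        if len(kept) == len(active):
            break
        active = kept  # at least one coordinate is dropped on every pass
    result = [0.0] * len(v)
    for i in active:
        result[i] = v[i] - shift
    return result


# Hypothetical values: an infeasible abundance vector and three 2-D endmembers.
abundances = project_to_simplex([1.0, 0.6, -0.3])
ed = endmember_distance([(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)])
```

The projected vector is nonnegative and sums to one, which corresponds to the two abundance constraints named in the abstract.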
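Abstract: With the development of the Internet and service-oriented technology, a new type of Web application, the Mashup service, has become popular on the Internet and is growing rapidly. How to find high-quality services among the large number of Mashup services has become a widely discussed problem. Finding functionally similar services and clustering them can effectively improve the precision and efficiency of service discovery. Current mainstream approaches mine the functional information hidden in Mashup services and then apply a specific clustering algorithm such as K-means. However, Mashup service documents are usually short texts, and traditional mining algorithms such as LDA cannot handle short texts effectively, so the clustering results are unsatisfactory. To address this problem, a TWE-NMF (nonnegative matrix factorization combining tags and word embedding) model based on nonnegative matrix factorization is proposed for topic modeling of Mashup services. The proposed method first normalizes the Mashup services. It then uses a Dirichlet process mixture model based on improved Gibbs sampling to estimate the number of topics automatically. Next, it combines word embeddings, service tags, and other information with nonnegative matrix factorization to compute the topic features of the Mashup services, and clusters the services with a spectral clustering algorithm. Finally, the performance of the proposed method is evaluated comprehensively. The experimental results show that, compared with existing service clustering methods, the proposed method achieves significant improvements in precision, recall, F-measure, purity, and entropy.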
Abstract: Many problems in image representation and classification involve some form of dimensionality reduction. Nonnegative matrix factorization (NMF) is a recently proposed unsupervised procedure for learning spatially localized, parts-based subspace representations of objects. An improvement of classical NMF is presented that combines it with Log-Gabor wavelets to enhance its parts-based learning ability. The new method is compared with principal component analysis (PCA) and with locally linear embedding (LLE), which was proposed recently in Science. Finally, the new method is applied to several real-world datasets and achieves good performance in representation and classification.
Abstract: Non-negative matrix factorization (NMF) is a technique for dimensionality reduction that places non-negativity constraints on the matrix. Based on the PARAFAC model, NMF was extended to three-dimensional data decomposition. A three-dimensional non-negative matrix factorization (NMF3) algorithm, which is concise and easy to implement, is given in this paper. The implementation of NMF3 is based on elements rather than on vectors. Unlike traditional algorithms, it can decompose a data array directly without unfolding. It was applied to the decomposition of a simulated data array and obtained reasonable results, which showed that NMF3 could be introduced for curve resolution in chemometrics.
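For readers unfamiliar with the underlying factorization, the sketch below shows ordinary two-dimensional NMF with the widely used multiplicative update rules for the squared-error objective. It is not the NMF3 algorithm of the paper, whose element-wise three-way updates are not given in the abstract. The helper names, the small constant `eps` that guards against division by zero, and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


# Hypothetical toy matrix whose rows are nonnegative mixes of two patterns.
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [3.0, 5.0, 7.0]]
w, h, errors = nmf(v, rank=2)
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, the factors stay nonnegative without any explicit projection step.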
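Abstract: Pedestrian detection is widely used in robotics, driver assistance systems, video surveillance, and other fields. This paper proposes a fast pedestrian detection method based on saliency detection and Histogram of Oriented Gradient-Non-negative Matrix Factorization (HOG-NMF) features. Frequency-tuned saliency detection is used to extract the saliency map, and regions of interest are extracted based on an entropy threshold. Non-negative matrix factorization and histograms of oriented gradients are combined to generate the HOG-NMF features, and an additive Intersection Kernel Support Vector Machine (IKSVM) is adopted. The algorithm significantly reduces the feature dimensionality and, at the same computational complexity, clearly improves on the detection rate of a linear support vector machine. Experimental results on the INRIA database show that, compared with HOG/linear SVM and HOG/RBF-SVM, the method significantly reduces detection time and achieves a satisfactory detection rate.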
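The kernel behind an intersection kernel SVM can be written in one line. The sketch below uses the standard histogram intersection kernel, the sum of element-wise minima of two nonnegative feature vectors, and builds the Gram matrix that a kernel SVM would consume. That the paper uses exactly this form is an assumption, and the function names and feature values are hypothetical.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(min(a, b) for a, b in zip(x, y))


def gram_matrix(features):
    """Pairwise kernel values for a list of nonnegative feature vectors."""
    return [[intersection_kernel(a, b) for b in features] for a in features]


# Hypothetical nonnegative feature vectors standing in for HOG-NMF features.
features = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.7, 0.1, 0.2]]
gram = gram_matrix(features)
```

The kernel is called additive because it is a sum of independent per-dimension terms, and it suits nonnegative features such as those produced by NMF.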