Multiple kernel clustering is an unsupervised data analysis method that has been used in various scenarios where data is easy to collect but hard to label. However, multiple kernel clustering for incomplete data is a critical yet challenging task. Although existing absent multiple kernel clustering methods have achieved remarkable performance on this task, they may fail when the data has a high value-missing rate, and they may easily fall into a local optimum. To address these problems, we propose an absent multiple kernel clustering (AMKC) method for incomplete data. The AMKC method first clusters the initialized incomplete data. Then, it constructs a new multiple-kernel-based data space, referred to as K-space, from multiple sources to learn kernel combination coefficients. Finally, it seamlessly integrates an incomplete-kernel-imputation objective, a multiple-kernel-learning objective, and a kernel-clustering objective to achieve absent multiple kernel clustering. The three stages are carried out simultaneously until the convergence condition is met. Experiments on six datasets with various characteristics demonstrate that the kernel imputation and clustering performance of the proposed method is significantly better than that of state-of-the-art competitors. The proposed method also converges quickly.
The fuzzy C-means clustering algorithm (FCM) is extended to the fuzzy kernel C-means clustering algorithm (FKCM) to effectively perform cluster analysis on diverse structures, such as non-hyperspherical data, data with noise, data with a mixture of heterogeneous cluster prototypes, asymmetric data, etc. Based on the Mercer kernel, the FKCM clustering algorithm is derived from the FCM algorithm combined with the kernel method. Experiments with synthetic and real data show that the FKCM clustering algorithm is general and, in contrast to the FCM algorithm, can effectively analyze datasets with varied structures in an unsupervised manner. Kernel-based clustering can be expected to become an important research direction in fuzzy clustering analysis.
Many classifiers and methods have been proposed to deal with the letter recognition problem. Among them, clustering is widely used, but a single clustering pass is not adequate. Here, we adopt data preprocessing and a re kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re kernel clustering into Kernel Nearest Neighbor classification (KNN), Radial Basis Function Neural Network (RBFNN), and Support Vector Machine (SVM). Furthermore, we compare re kernel clustering with one-time kernel clustering, which is denoted kernel clustering for short. Experimental results show that re kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
To deal with the nonlinearly separable problem, the generalized noise clustering (GNC) algorithm is extended to a kernel generalized noise clustering (KGNC) model. Unlike the fuzzy c-means (FCM) model and the GNC model, which are based on Euclidean distance, the presented model is based on a kernel-induced distance. With the kernel method, the input data are nonlinearly and implicitly mapped into a high-dimensional feature space, where the nonlinear pattern appears linear and the GNC algorithm is performed. It is unnecessary to compute in the high-dimensional feature space because the kernel function can be evaluated in the input space. The effectiveness of the proposed algorithm is verified by experiments on three data sets. It is concluded that the KGNC algorithm has better clustering accuracy than FCM and GNC on data sets containing noisy data.
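The kernel-induced distance mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which kernel KGNC uses, so the Gaussian kernel, the `sigma` parameter, and the function names below are assumptions chosen for illustration. The identity itself is standard: the squared feature-space distance expands into three kernel evaluations, so the mapping never has to be computed explicitly.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel evaluated directly on input-space vectors.

    The choice of kernel and the default sigma are assumptions for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, kernel=gaussian_kernel):
    """Squared feature-space distance ||phi(x) - phi(v)||^2 via the kernel trick.

    Expands to K(x, x) - 2 K(x, v) + K(v, v); phi is never evaluated.
    """
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)
```

For the Gaussian kernel, K(x, x) = 1, so the distance reduces to 2(1 - K(x, v)) and is bounded above by 2, which is one reason far-away noisy points have limited influence.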
Multiple kernel clustering based on local kernel alignment has achieved outstanding clustering performance by applying local kernel alignment to each sample. However, we observe that most existing works assume that each local kernel alignment contributes equally to clustering performance, while local kernel alignments on different samples actually contribute differently. This assumption could therefore have a negative effect on clustering performance. To solve this issue, we design a multiple kernel clustering algorithm based on self-weighted local kernel alignment, which can learn a proper weight for each local kernel alignment. Specifically, we introduce a new optimization variable, the weight, to denote the contribution of each local kernel alignment to clustering performance; the weights, kernel combination coefficients, and cluster memberships are then alternately optimized under the kernel alignment framework. In addition, we develop a three-step alternating iterative optimization algorithm to address the resulting optimization problem. Extensive experiments on five benchmark data sets were conducted to evaluate the clustering performance of the proposed algorithm. The experimental results clearly demonstrate that the proposed algorithm outperforms typical multiple kernel clustering algorithms, which illustrates its effectiveness.
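As a rough illustration of the quantities this abstract refers to, the sketch below computes the standard kernel alignment (the normalized Frobenius inner product of two kernel matrices), a local variant restricted to one sample's neighborhood, and a weighted sum of local alignments. The abstract does not give the paper's objective, so the function names, the neighborhood representation, and the weighted-sum form are assumptions, not the authors' formulation.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized matrices (lists of rows)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def kernel_alignment(k1, k2):
    """Standard kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    norm = math.sqrt(frobenius_inner(k1, k1) * frobenius_inner(k2, k2))
    return frobenius_inner(k1, k2) / norm


def local_alignment(k1, k2, neighbors):
    """Alignment restricted to the sub-matrices indexed by one sample's neighbors."""
    sub1 = [[k1[i][j] for j in neighbors] for i in neighbors]
    sub2 = [[k2[i][j] for j in neighbors] for i in neighbors]
    return kernel_alignment(sub1, sub2)


def weighted_local_alignment(k1, k2, neighborhoods, weights):
    """Hypothetical self-weighted score: one weight per sample's local alignment."""
    return sum(w * local_alignment(k1, k2, nb) for w, nb in zip(weights, neighborhoods))
```

With equal weights this reduces to the equal-contribution assumption the abstract criticizes; learning the weights is the part the paper adds.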
Because the kernel K-means clustering method runs in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. To overcome the deficiency of existing methods, in which the initial cluster centers are selected arbitrarily in the original space, this paper proposes a new method for determining the cluster centers: virtual cluster centers defined in the feature space by an initial classification serve as the initial cluster centers, and the cluster centers in later iterations are determined by further virtual classification. The improved method is applied to fault diagnosis of roller bearings and achieves good clustering and diagnosis results, which demonstrates the effectiveness of the proposed method.
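The idea that a cluster center exists only implicitly in feature space can be sketched as follows. This is textbook kernel K-means started from an initial labeling, not necessarily the paper's exact procedure; the function names and the `max_iter` parameter are assumptions. The center of a cluster is never formed: its distance to a sample is computed entirely from entries of the kernel matrix `K`.

```python
def center_distance_sq(K, i, members):
    """Squared feature-space distance from sample i to the implicit mean of `members`.

    Uses K[i][i] - (2/n) * sum_j K[i][j] + (1/n^2) * sum_{j,l} K[j][l].
    """
    n = len(members)
    cross = sum(K[i][j] for j in members) / n
    within = sum(K[j][l] for j in members for l in members) / (n * n)
    return K[i][i] - 2.0 * cross + within


def kernel_kmeans(K, labels, max_iter=100):
    """Reassign samples to the nearest implicit center until the labels stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        # Group sample indices by current label; these groups define the virtual centers.
        clusters = {}
        for idx, label in enumerate(labels):
            clusters.setdefault(label, []).append(idx)
        new_labels = [
            min(sorted(clusters), key=lambda c: center_distance_sq(K, i, clusters[c]))
            for i in range(len(K))
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels
```

Because the centers are defined by the current labeling, the initial classification fully determines the initial centers, which matches the motivation given in the abstract.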
A novel fuzzy clustering model using kernel methods is proposed, called the kernel modified possibilistic c-means (KMPCM) model. The proposed model extends the modified possibilistic c-means (MPCM) algorithm with kernel methods. Unlike the MPCM and fuzzy c-means (FCM) models, which are based on Euclidean distance, the proposed model is based on a kernel-induced distance. Furthermore, with kernel methods the input data can be mapped implicitly into a high-dimensional feature space where the nonlinear pattern appears linear. It is unnecessary to compute in the high-dimensional feature space because the kernel function does so implicitly. Numerical experiments show that KMPCM outperforms FCM and MPCM.
A new algorithm named kernel bisecting k-means and sample removal (KBK-SR) is proposed as a sampling preprocessing step for support vector machine (SVM) training to improve efficiency. The proposed algorithm tends to quickly produce balanced clusters of similar sizes in the kernel feature space, which makes it efficient and effective for reducing training samples. Theoretical analysis and experimental results on three real-data UCI benchmarks both show that, with very short sampling time, the proposed algorithm dramatically accelerates SVM sampling and training while maintaining high test accuracy.
Over the last fifteen years, face recognition has become a popular area of research in image analysis and one of the most successful applications of machine learning and understanding. To enhance the classification rate of image recognition, several techniques are introduced, modified, and combined. The suggested model extracts features using a Fourier-Gabor filter, selects the best features using the signal-to-noise ratio, deletes or modifies anomalous images using fuzzy c-means clustering, and uses kernel least squares optimized by wild dog pack optimization. Four datasets are used to compare the suggested method with previous methods. The results indicate that the suggested methods, both without and with fuzzy clustering, outperform state-of-the-art methods on all datasets.
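To address the limitation of a global fixed bandwidth in load extrapolation by kernel density estimation, a kernel density estimation (KDE) load extrapolation method is proposed in which the bandwidth selection is improved using KANN-DBSCAN (K-average nearest neighbor density-based spatial clustering of applications with noise). The KANN-DBSCAN clustering algorithm groups the load data into clusters, the rule of thumb is used to obtain the optimal bandwidth for each cluster, kernel density estimation is then performed, and Monte Carlo simulation is used for extrapolation. Measured load data from an electric vehicle on customer roads are used to test the soundness of the extrapolation method. The extrapolation is evaluated on three indicators: statistical parameter tests, goodness-of-fit tests, and pseudo-damage tests. The results show that, compared with the fixed-bandwidth KDE extrapolation method, the extrapolated load obtained by the KANN-DBSCAN-based KDE method is closer to the measured load in its statistical parameters, with errors in the mean, standard deviation, and maximum of only 1.9%, 4.3%, and 1.9%, respectively; the goodness of fit R² of the amplitude cumulative frequency curves is greater than 0.99 in all cases, and the pseudo-damage values are all close to 1. The results verify the effectiveness of the clustering method for KDE load extrapolation, help in compiling load spectra for vehicles on customer roads, and provide a reference for load extrapolation of mechanical components with similar load distribution characteristics.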
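The pipeline described in this abstract (cluster the loads, choose one bandwidth per cluster by a rule of thumb, estimate the density, then extrapolate by Monte Carlo sampling) can be sketched as below. Several parts are assumptions: the clusters are passed in already formed instead of being produced by KANN-DBSCAN, the rule of thumb is taken to be the normal-reference rule h = 1.06 · σ · n^(-1/5), the kernel is Gaussian, the loads are one-dimensional, and all function names are hypothetical.

```python
import math
import random
import statistics


def rule_of_thumb_bandwidth(values):
    """Normal-reference rule of thumb: h = 1.06 * sample std * n^(-1/5)."""
    return 1.06 * statistics.stdev(values) * len(values) ** (-0.2)


def clustered_kde(clusters):
    """Density estimate in which every sample uses the bandwidth of its own cluster."""
    total = sum(len(c) for c in clusters)
    parts = [(c, rule_of_thumb_bandwidth(c)) for c in clusters]

    def density(x):
        acc = 0.0
        for values, h in parts:
            for v in values:
                z = (x - v) / h
                acc += math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))
        return acc / total  # each sample carries weight 1/N

    return density


def sample_loads(clusters, count, rng):
    """Monte Carlo draws from the estimate: pick a sample, add Gaussian noise of width h."""
    pool = [(v, rule_of_thumb_bandwidth(c)) for c in clusters for v in c]
    picks = (rng.choice(pool) for _ in range(count))
    return [rng.gauss(v, h) for v, h in picks]
```

A tight cluster gets a narrow bandwidth and a spread-out cluster a wide one, which is the stated advantage over a single global bandwidth. Calling `sample_loads(clusters, 1000, random.Random(0))` would generate synthetic loads that can exceed the measured extremes.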
Funding: This work was funded by the National Natural Science Foundation of China under Grant Nos. 61972057 and U1836208; the Hunan Provincial Natural Science Foundation of China under Grant No. 2019JJ50655; the Scientific Research Foundation of Hunan Provincial Education Department of China under Grant No. 18B160; the Open Fund of Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle Infrastructure Systems (Changsha University of Science and Technology) under Grant No. kfj180402; the “Double First-class” International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant No. 2018IC25; and the Researchers Supporting Project No. (RSP-2020/102), King Saud University, Riyadh, Saudi Arabia.
Funding: Supported by the National Natural Science Foundation of China (60675039), the National High Technology Research and Development Program of China (863 Program) (2006AA04Z217), and the Hundred Talents Program of the Chinese Academy of Sciences.
Funding: Supported by the National Science Foundation (No. IIS-9988642) and the Multidisciplinary Research Program.
Funding: The 15th Plan National Defence Preventive Research Project (No. 413030201).
Funding: This work was supported by the National Key R&D Program of China (No. 2018YFB1003203), the National Natural Science Foundation of China (Nos. 61672528, 61773392, 61772561), the Educational Commission of Hunan Province, China (No. 14B193), and the Key Research & Development Plan of Hunan Province (No. 2018NK2012).
Funding: Project supported by the 15th Plan for National Defence Preventive Research Project (Grant No. 413030201).
Funding: National Natural Science Foundation of China (No. 60975083); Key Grant Project, Ministry of Education, China (No. 104145).