Many classifiers and methods have been proposed for the letter recognition problem. Among them, clustering is widely used, but a single round of clustering is not adequate. Here, we adopt data preprocessing and a re-kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re-kernel clustering into Kernel Nearest Neighbor classification (KNN), Radial Basis Function Neural Network (RBFNN), and Support Vector Machine (SVM). Furthermore, we compare re-kernel clustering with one-time kernel clustering, denoted kernel clustering for short. Experimental results show that re-kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
Multiple kernel clustering is an unsupervised data analysis method that has been used in various scenarios where data are easy to collect but hard to label. However, multiple kernel clustering for incomplete data is a critical yet challenging task. Although existing absent multiple kernel clustering methods have achieved remarkable performance on this task, they may fail when the data have a high value-missing rate, and they may easily fall into a local optimum. To address these problems, we propose an absent multiple kernel clustering (AMKC) method for incomplete data. The AMKC method first clusters the initialized incomplete data. Then, it constructs a new multiple-kernel-based data space, referred to as K-space, from multiple sources to learn kernel combination coefficients. Finally, it seamlessly integrates an incomplete-kernel-imputation objective, a multiple-kernel-learning objective, and a kernel-clustering objective to achieve absent multiple kernel clustering. The three stages are carried out simultaneously until the convergence condition is met. Experiments on six datasets with various characteristics demonstrate that the kernel imputation and clustering performance of the proposed method is significantly better than that of state-of-the-art competitors. The proposed method also converges quickly.
Multiple kernel clustering based on local kernel alignment has achieved outstanding clustering performance by applying local kernel alignment to each sample. However, we observe that most existing works assume that every local kernel alignment contributes equally to clustering performance, whereas local kernel alignments on different samples actually contribute differently. This assumption can therefore have a negative effect on clustering performance. To solve this issue, we design a multiple kernel clustering algorithm based on self-weighted local kernel alignment, which learns a proper weight for each local kernel alignment. Specifically, we introduce a new optimization variable, the weight, to denote the contribution of each local kernel alignment to clustering performance; the weight, the kernel combination coefficients, and the cluster membership are then alternately optimized under the kernel alignment framework. In addition, we develop a three-step alternating iterative optimization algorithm to solve the resulting optimization problem. Extensive experiments on five benchmark data sets were conducted to evaluate the clustering performance of the proposed algorithm. The experimental results demonstrate that the proposed algorithm outperforms typical multiple kernel clustering algorithms, which illustrates its effectiveness.
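For orientation, the quantity that local kernel alignment builds on can be sketched in a few lines. The block below computes the standard Frobenius-normalized alignment between two Gram matrices and then evaluates it on each sample's neighborhood. It is only an illustrative sketch, not the self-weighted algorithm of the abstract: the helper names, the choice of the `k` most similar samples as the neighborhood, and the toy data are all assumptions made here.

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Frobenius inner product of two Gram matrices, normalized by their norms."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def local_alignments(K, H, k):
    """Alignment between K and the ideal kernel H @ H.T on each sample's
    neighborhood (hypothetical helper; neighborhood = k most similar samples)."""
    ideal = H @ H.T
    scores = []
    for i in range(K.shape[0]):
        idx = np.argsort(-K[i])[:k]      # indices of the k largest kernel values in row i
        sub = np.ix_(idx, idx)           # select the neighborhood sub-matrix
        scores.append(kernel_alignment(K[sub], ideal[sub]))
    return np.array(scores)

# Toy data (assumed): two tight, well-separated groups and a Gaussian kernel.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / 2.0)
H = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # one-hot cluster membership
scores = local_alignments(K, H, k=2)
```

A self-weighted scheme in the spirit of the abstract would attach a learnable weight to each entry of `scores` instead of summing them uniformly; how those weights are updated is specific to the paper and is not reproduced here.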
To deal with the nonlinearly separable problem, the generalized noise clustering (GNC) algorithm is extended to a kernel generalized noise clustering (KGNC) model. Unlike the fuzzy c-means (FCM) model and the GNC model, which are based on Euclidean distance, the presented model is based on a kernel-induced distance obtained with the kernel method. With the kernel method, the input data are nonlinearly and implicitly mapped into a high-dimensional feature space, where the nonlinear pattern appears linear and the GNC algorithm is performed. It is unnecessary to compute in the high-dimensional feature space, because the kernel function performs the computation in the input space. The effectiveness of the proposed algorithm is verified by experiments on three data sets. It is concluded that the KGNC algorithm has better clustering accuracy than FCM and GNC on data sets containing noisy data.
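The remark that nothing has to be computed in the feature space follows from the identity ||φ(x) − φ(y)||² = k(x, x) − 2k(x, y) + k(y, y). The sketch below checks it numerically with a quadratic kernel whose feature map can be written out explicitly; the function names and the choice of test kernel are assumptions made for illustration and are not taken from the KGNC paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def kernel_distance_sq(x, y, kernel):
    """Squared feature-space distance computed from kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def quad_kernel(x, y):
    """Homogeneous quadratic kernel k(x, y) = (x . y)^2."""
    return float(np.dot(x, y)) ** 2

def quad_features(x):
    """Explicit feature map of quad_kernel for 2-D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
d_implicit = kernel_distance_sq(x, y, quad_kernel)                   # input space only
d_explicit = np.sum((quad_features(x) - quad_features(y)) ** 2)      # explicit mapping
d_gauss = kernel_distance_sq(x, y, gaussian_kernel)                  # no explicit map available
```

Both `d_implicit` and `d_explicit` evaluate to 123 here, while `d_gauss` shows the same formula working for a kernel whose feature map cannot be enumerated. A kernelized clustering model can substitute this distance wherever the Euclidean one appears.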
A Recommender System (RS) is a crucial part of many firms, particularly those involved in e-commerce. In a conventional RS, a user can offer only a single rating for an item, which is insufficient to capture consumer preferences. Nowadays, businesses in industries such as e-learning and tourism let customers rate a product on a variety of criteria in order to understand their preferences. Meanwhile, the collaborative filtering (CF) algorithm based on an AutoEncoder (AE) has proved effective in identifying items that interest users. However, the cost of these computations increases nonlinearly as the number of items and users grows. To overcome these issues, a novel expanded stacked autoencoder (ESAE) with a Kernel Fuzzy C-Means Clustering (KFCM) technique is proposed, with two phases. In the first, offline phase, the sparse multicriteria rating matrix is smoothed into a complete matrix by predicting the users' intact ratings with the ESAE approach, and users are clustered using the KFCM approach. In the second, online phase, the top-N recommendation prediction is made by the ESAE approach using only the most similar users from multiple clusters. The ESAE_KFCM model thereby achieves a prediction accuracy of 98.2% in Top-N recommendation with a reduced recommendation generation time. An experimental evaluation on the Yahoo! Movies (YM) movie dataset and the TripAdvisor (TA) travel dataset confirmed that the ESAE_KFCM model consistently outperforms conventional RS algorithms on a variety of assessment measures.
A clustering algorithm based on Sparse Projection (SP), called Sparse Projection Clustering (SPC), is proposed in this letter. The basic idea is to apply SP to project the observed data onto a high-dimensional sparse space. This is a nonlinear mapping with an explicit form, so the K-means clustering algorithm can be used to explore the inherent data patterns in the new space. The proposed algorithm is applied to cluster a complete artificial dataset and an incomplete real dataset. Compared with the kernel K-means clustering algorithm, the proposed algorithm is more efficient.
The hydraulic fracturing (HF) technique has been extensively used for the exploitation of unconventional oil and gas reservoirs. HF enhances the connectivity of less permeable oil- and gas-bearing rock formations by fluid injection, which creates an interconnected fracture network and increases hydrocarbon production. Meanwhile, microseismic (MS) monitoring is one of the most effective approaches for evaluating such a stimulation process. In this paper, the combined finite-discrete element method (FDEM) is adopted to numerically simulate HF and the associated MS. Several post-processing tools, including the frequency-magnitude distribution (b-value), the fractal dimension (D-value), and seismic event clustering, are used to interpret the numerical results. A non-parametric clustering algorithm designed specifically for FDEM is used to reduce the mesh dependency and extract more realistic seismic information. Simulation results indicate that at the local scale, the HF process tends to propagate along the rock mass discontinuities, while at the reservoir scale, it tends to develop in the direction parallel to the maximum in-situ stress. © 2014 Institute of Rock and Soil Mechanics, Chinese Academy of Sciences. Production and hosting by Elsevier B.V. All rights reserved.
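As an illustration of the b-value post-processing step, the slope of the frequency-magnitude distribution is commonly estimated with Aki's maximum-likelihood formula, b = log10(e) / (mean(M) − Mc), where Mc is the magnitude of completeness. The sketch below applies it to synthetic magnitudes. The abstract does not say which estimator the paper uses, so the estimator, the function name `b_value_mle`, the optional binning correction, and the synthetic catalogue are all assumptions.

```python
import numpy as np

def b_value_mle(magnitudes, m_c, bin_width=0.0):
    """Maximum-likelihood b-value for events with magnitude >= m_c.

    bin_width applies the usual half-bin correction when magnitudes are
    rounded to fixed bins; leave it at 0.0 for continuous magnitudes.
    """
    mags = np.asarray(magnitudes, dtype=float)
    mags = mags[mags >= m_c]             # keep only events above completeness
    if mags.size == 0:
        raise ValueError("no events at or above the completeness magnitude")
    return np.log10(np.e) / (mags.mean() - (m_c - bin_width / 2.0))

# Synthetic catalogue (assumed): Gutenberg-Richter magnitudes with b = 1 above m_c.
rng = np.random.default_rng(0)
true_b, m_c = 1.0, -2.0
synthetic = m_c + rng.exponential(scale=1.0 / (true_b * np.log(10.0)), size=20000)
b_hat = b_value_mle(synthetic, m_c)
```

With 20,000 events the estimate `b_hat` should land close to the true value of 1; for real or simulated microseismic catalogues the result depends strongly on how Mc is chosen.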
A feature extraction method for latent fault detection and failure mode classification of board-level packages subjected to vibration loading is presented for prognostics and health management (PHM) of electronics, using adaptive spectrum kurtosis and kernel probability distance clustering. First, strain response data of electronic components are filtered by the empirical mode decomposition (EMD) method based on maximum spectrum kurtosis (SK), and a fault symptom vector is developed by computing and reconstructing the envelope spectrum. Second, the nonlinear fault symptom data are mapped and clustered in a sparse Hilbert space using a Gaussian radial basis kernel probabilistic distance clustering method. Finally, the current state of the board-level package is estimated by computing the membership probability of its envelope spectrum. The experimental results demonstrate that the method can effectively detect and classify the latent failure mode of a board-level package before failure occurs.
Funding (letter recognition with re-kernel clustering): Supported by the National Science Foundation (No. IIS-9988642); the Multidisciplinary Research Program.
Funding (absent multiple kernel clustering, AMKC): Funded by the National Natural Science Foundation of China under Grant Nos. 61972057 and U1836208; the Hunan Provincial Natural Science Foundation of China under Grant No. 2019JJ50655; the Scientific Research Foundation of Hunan Provincial Education Department of China under Grant No. 18B160; the Open Fund of Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle Infrastructure Systems (Changsha University of Science and Technology) under Grant No. kfj180402; the "Double First-class" International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant No. 2018IC25; the Researchers Supporting Project No. (RSP-2020/102), King Saud University, Riyadh, Saudi Arabia.
Funding (self-weighted local kernel alignment): This work was supported by the National Key R&D Program of China (No. 2018YFB1003203); the National Natural Science Foundation of China (Nos. 61672528, 61773392, 61772561); the Educational Commission of Hunan Province, China (No. 14B193); the Key Research & Development Plan of Hunan Province (No. 2018NK2012).
Funding (kernel generalized noise clustering, KGNC): The 15th Plan National Defence Preventive Research Project (No. 413030201).
Funding (Sparse Projection Clustering, SPC): Supported by the National Natural Science Foundation of China (No. 60872123); the Joint Fund of the National Natural Science Foundation and the Guangdong Provincial Natural Science Foundation (No. U0835001).
Funding (FDEM simulation of hydraulic fracturing): Supported by the Natural Sciences and Engineering Research Council of Canada through Discovery Grant 341275 (G. Grasselli) and Engage EGP 461019-13.
Funding (fault detection for board-level packages): Supported by the National Natural Science Foundation of China (Grant No. 51201182); the Aeronautical Science Foundation of China (Grant No. 20142896022).