Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize cl...Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.展开更多
Fuzzy C-Means(FCM)is an effective and widely used clustering algorithm,but there are still some problems.considering the number of clusters must be determined manually,the local optimal solutions is easily influenced ...Fuzzy C-Means(FCM)is an effective and widely used clustering algorithm,but there are still some problems.considering the number of clusters must be determined manually,the local optimal solutions is easily influenced by the random selection of initial cluster centers,and the performance of Euclid distance in complex high-dimensional data is poor.To solve the above problems,the improved FCM clustering algorithm based on density Canopy and Manifold learning(DM-FCM)is proposed.First,a density Canopy algorithm based on improved local density is proposed to automatically deter-mine the number of clusters and initial cluster centers,which improves the self-adaptability and stability of the algorithm.Then,considering that high-dimensional data often present a nonlinear structure,the manifold learning method is applied to construct a manifold spatial structure,which preserves the global geometric properties of complex high-dimensional data and improves the clustering effect of the algorithm on complex high-dimensional datasets.Fowlkes-Mallows Index(FMI),the weighted average of homogeneity and completeness(V-measure),Adjusted Mutual Information(AMI),and Adjusted Rand Index(ARI)are used as performance measures of clustering algorithms.The experimental results show that the manifold learning method is the superior distance measure,and the algorithm improves the clustering accuracy and performs superiorly in the clustering of low-dimensional and complex high-dimensional data.展开更多
To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Con...To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of final clusters, are presented using WordNet origin from key phrases. Initial centers and membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built with the topic concept phrases representing topics of the texts and the initialization of centers and the membership matrix depend on the concept vectors in sub-spaces. The results show that, different from random initialization of traditional fuzzy c-means clustering, the initialization related to text content contributions can improve clustering precision.展开更多
A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive...A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive to initializations and often generates coincident clusters. AFCM overcomes this shortcoming and it is an ex tension of PCM. Membership and typicality values can be simultaneously produced in AFCM. Experimental re- suits show that noise data can be well processed, coincident clusters are avoided and clustering accuracy is better.展开更多
Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and ...Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and timely. First, FCM clustering is employed to classify the data into different clusters, which helps to estimate whether there is a fault and how many fault types there are. If fault signals exist, the fault vibration signals are then demodulated and decomposed into different frequency bands by MMEMD in order to be analyzed further. In order to overcome the mode mixing defect of empirical mode decomposition (EMD), a novel method called MMEMD is proposed. It is an improvement to masking empirical mode decomposition (MEMD). By adding multi-masking signals to the signals to be decomposed in different levels, it can restrain low-frequency components from mixing in highfrequency components effectively in the sifting process and then suppress the mode mixing. It has the advantages of easy implementation and strong ability of suppressing modal mixing. The fault type is determined by Hilbert envelope finally. The results of simulation signal decomposition showed the high performance of MMEMD. Experiments of bearing fault diagnosis in wind turbine bearing fault diagnosis proved the validity and high accuracy of the new method.展开更多
To solve the problem of poor anti-noise performance of the traditional fuzzy C-means (FCM) algorithm in image segmentation, a novel two-dimensional FCM clustering algorithm for image segmentation was proposed. In this...To solve the problem of poor anti-noise performance of the traditional fuzzy C-means (FCM) algorithm in image segmentation, a novel two-dimensional FCM clustering algorithm for image segmentation was proposed. In this method, the image segmentation was converted into an optimization problem. The fitness function containing neighbor information was set up based on the gray information and the neighbor relations between the pixels described by the improved two-dimensional histogram. By making use of the global searching ability of the predator-prey particle swarm optimization, the optimal cluster center could be obtained by iterative optimization, and the image segmentation could be accomplished. The simulation results show that the segmentation accuracy ratio of the proposed method is above 99%. The proposed algorithm has strong anti-noise capability, high clustering accuracy and good segment effect, indicating that it is an effective algorithm for image segmentation.展开更多
Determining the relatively similar hydrological properties of the watersheds is very crucial in order to readily classify them for management practices such as flood and soil erosion control. This study aimed to ident...Determining the relatively similar hydrological properties of the watersheds is very crucial in order to readily classify them for management practices such as flood and soil erosion control. This study aimed to identify homogeneous hydrological watersheds using remote sensing data in western Iran. To achieve this goal, remote sensing indices including SAVI, LAI, NDMI, NDVI and snow cover, were extracted from MODIS data over the period 2000 to 2015. Then, a fuzzy method was used to clustering the watersheds based on the extracted indices. A fuzzy c-mean(FCM) algorithm enabled to classify 38 watersheds in three homogeneous groups.The optimal number of clusters was determined through evaluation of partition coefficient, partition entropy function and trial and error. The results indicated three homogeneous regions identified by the fuzzy c-mean clustering and remote sensing product which are consistent with the variations of topography and climate of the study area. Inherently,the grouped watersheds have similar hydrological properties and are likely to need similar management considerations and measures.展开更多
Suppressed fuzzy c-means (S-FCM) clustering algorithm with the intention of combining the higher speed of hard c-means clustering algorithm and the better classification performance of fuzzy c-means clustering algorit...Suppressed fuzzy c-means (S-FCM) clustering algorithm with the intention of combining the higher speed of hard c-means clustering algorithm and the better classification performance of fuzzy c-means clustering algorithm had been studied by many researchers and applied in many fields. In the algorithm, how to select the suppressed rate is a key step. In this paper, we give a method to select the fixed suppressed rate by the structure of the data itself. The experimental results show that the proposed method is a suitable way to select the suppressed rate in suppressed fuzzy c-means clustering algorithm.展开更多
Diabetic Retinopathy(DR)is a vision disease due to the long-term prevalenceof Diabetes Mellitus.It affects the retina of the eye and causes severedamage to the vision.If not treated on time it may lead to permanent vi...Diabetic Retinopathy(DR)is a vision disease due to the long-term prevalenceof Diabetes Mellitus.It affects the retina of the eye and causes severedamage to the vision.If not treated on time it may lead to permanent vision lossin diabetic patients.Today’s development in science has no medication to cureDiabetic Retinopathy.However,if diagnosed at an early stage it can be controlledand permanent vision loss can be avoided.Compared to the diabetic population,experts to diagnose Diabetic Retinopathy are very less in particular to local areas.Hence an automatic computer-aided diagnosis for DR detection is necessary.Inthis paper,we propose an unsupervised clustering technique to automatically clusterthe DR into one of its five development stages.The deep learning based unsupervisedclustering is made to improve itself with the help of fuzzy rough c-meansclustering where cluster centers are updated by fuzzy rough c-means clusteringalgorithm during the forward pass and the deep learning model representationsare updated by Stochastic Gradient Descent during the backward pass of training.The proposed method was implemented using python and the results were takenon DGX server with Tesla V100 GPU cards.An experimental result on the publicallyavailable Kaggle dataset shows an overall accuracy of 88.7%.The proposedmodel improves the accuracy of DR diagnosis compared to the existingunsupervised algorithms like k-means,FCM,auto-encoder,and FRCM withalexnet.展开更多
Partition-based clustering with weighted feature is developed in the framework of shadowed sets. The objects in the core and boundary regions, generated by shadowed sets-based clustering, have different impact on the ...Partition-based clustering with weighted feature is developed in the framework of shadowed sets. The objects in the core and boundary regions, generated by shadowed sets-based clustering, have different impact on the prototype of each cluster. By integrating feature weights, a formula for weight calculation is introduced to the clustering algorithm. The selection of weight exponent is crucial for good result and the weights are updated iteratively with each partition of clusters. The convergence of the weighted algorithms is given, and the feasible cluster validity indices of data mining application are utilized. Experimental results on both synthetic and real-life numerical data with different feature weights demonstrate that the weighted algorithm is better than the other unweighted algorithms.展开更多
Minimally Invasive Spine surgery (MISS) was developed to treat disorders of the spine with less disruption to the muscles. Surgeons use CT images to monitor the volume of muscles after operation in order to evaluate t...Minimally Invasive Spine surgery (MISS) was developed to treat disorders of the spine with less disruption to the muscles. Surgeons use CT images to monitor the volume of muscles after operation in order to evaluate the progress of patient recovery. The first step in the task is to segment the muscle regions from other tissues/organs in CT images. However, manual segmentation of muscle regions is not only inaccurate, but also time consuming. In this work, Gray Space Map (GSM) is used in fuzzy c-means clustering algorithm to segment muscle regions in CT images. GSM com- bines both spatial and intensity information of pixels. Experiments show that the proposed GSM- based fuzzy c-means clustering muscle CT image segmentation yields very good results.展开更多
A novel example-based process for Automated Colorization of grayscale images using Texture Descriptors (ACTD) without any human intervention is proposed. By analyzing a set of sample color images, coherent regions of ...A novel example-based process for Automated Colorization of grayscale images using Texture Descriptors (ACTD) without any human intervention is proposed. By analyzing a set of sample color images, coherent regions of homogeneous textures are extracted. A multi-channel filtering technique is used for texture-based image segmentation, combined with a modified Fuzzy C-means (FCM) clustering algorithm. This modified FCM clustering algorithm includes both the local spatial information from neighboring pixels, and the spatial Euclidian distance to the cluster’s center of gravity. For each area of interest, state-of-the-art texture descriptors are then computed and stored, along with corresponding color information. These texture descriptors and the color information are used for colorization of a grayscale image with similar textures. Given a grayscale image to be colorized, the segmentation and feature extraction processes are repeated. The texture descriptors are used to perform Content-Based Image Retrieval (CBIR). The colorization process is performed by Chroma replacement. This research finds numerous applications, ranging from classic film restoration and enhancement, to adding valuable information into medical and satellite imaging. Also, this can be used to enhance the detection of objects from x-ray images at the airports.展开更多
Classifying the data into a meaningful group is one of the fundamental ways of understanding and learning the valuable information. High-quality clustering methods are necessary for the valuable and efficient analysis...Classifying the data into a meaningful group is one of the fundamental ways of understanding and learning the valuable information. High-quality clustering methods are necessary for the valuable and efficient analysis of the increasing data. The Firefly Algorithm (FA) is one of the bio-inspired algorithms and it is recently used to solve the clustering problems. In this paper, Hybrid F-Firefly algorithm is developed by combining the Fuzzy C-Means (FCM) with FA to improve the clustering accuracy with global optimum solution. The Hybrid F-Firefly algorithm is developed by incorporating FCM operator at the end of each iteration in FA algorithm. This proposed algorithm is designed to utilize the goodness of existing algorithm and to enhance the original FA algorithm by solving the shortcomings in the FCM algorithm like the trapping in local optima and sensitive to initial seed points. In this research work, the Hybrid F-Firefly algorithm is implemented and experimentally tested for various performance measures under six different benchmark datasets. From the experimental results, it is observed that the Hybrid F-Firefly algorithm significantly improves the intra-cluster distance when compared with the existing algorithms like K-means, FCM and FA algorithm.展开更多
Traditional Fuzzy C-Means(FCM)and Possibilistic C-Means(PCM)clustering algorithms are data-driven,and their objective function minimization process is based on the available numeric data.Recently,knowledge hints have ...Traditional Fuzzy C-Means(FCM)and Possibilistic C-Means(PCM)clustering algorithms are data-driven,and their objective function minimization process is based on the available numeric data.Recently,knowledge hints have been introduced to formknowledge-driven clustering algorithms,which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints.However,these algorithms cannot produce the optimal number of clusters by the clustering algorithm itself;they require the assistance of evaluation indices.Moreover,knowledge hints are usually used as part of the data structure(directly replacing some clustering centers),which severely limits the flexibility of the algorithm and can lead to knowledgemisguidance.To solve this problem,this study designs a newknowledge-driven clustering algorithmcalled the PCM clusteringwith High-density Points(HP-PCM),in which domain knowledge is represented in the form of so-called high-density points.First,a newdatadensitycalculation function is proposed.The Density Knowledge Points Extraction(DKPE)method is established to filter out high-density points from the dataset to form knowledge hints.Then,these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure.Finally,the initial number of clusters is set to be greater than the true one based on the number of knowledge hints.Then,the HP-PCM algorithm automatically determines the final number of clusters during the clustering process by considering the cluster elimination mechanism.Through experimental studies,including some comparative analyses,the results highlight the effectiveness of the proposed algorithm,such as the increased success rate in clustering,the ability to determine the optimal cluster number,and the faster convergence speed.展开更多
Studying user electricity consumption behavior is crucial for understanding their power usage patterns.However,the traditional clustering methods fail to identify emerging types of electricity consumption behavior.To ...Studying user electricity consumption behavior is crucial for understanding their power usage patterns.However,the traditional clustering methods fail to identify emerging types of electricity consumption behavior.To address this issue,this paper introduces a statistical analysis of clusters and evaluates the set of indicators for power usage patterns.The fuzzy C-means clustering algorithm is then used to analyze 6 months of electricity consumption data in 2017 from energy storage equipment,agricultural drainage irrigation,port shore power,and electric vehicles.Finally,the proposed method is validated through experiments,where the Davies-Bouldin index and profile coefficient are calculated and compared.Experiments showed that the optimal number of clusters is 4.This study demonstrates the potential of using a fuzzy C-means clustering algorithmin identifying emerging types of electricity consumption behavior,which can help power system operators and policymakers to make informed decisions and improve energy efficiency.展开更多
An advanced fuzzy C-mean (FCM) algorithm was proposed for the efficient regional clustering of multi-nodes interconnected systems. Due to various locational prices and regional coherencies for each node and point, m...An advanced fuzzy C-mean (FCM) algorithm was proposed for the efficient regional clustering of multi-nodes interconnected systems. Due to various locational prices and regional coherencies for each node and point, modified similarity measure was considered to gather nodes having similar characteristics. The similarity measure was needed to contain locafi0nal prices as well as regional coherency. In order to consider the two properties simultaneously, distance measure of fuzzy C-mean algorithm had to be modified. Regional clustering algorithm for interconnected power systems was designed based on the modified fuzzy C-mean algorithm. The proposed algorithm produces proper classification for the interconnected power system and the results are demonstrated in the example of IEEE 39-bus interconnected electricity system.展开更多
Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achiev...Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achieving higher clustering accuracy, which few studies have investigated. Built upon two existing methods on selecting fuzzifier, we developed an integrated fuzzifier evaluation and selection algorithm and tested it using real datasets. Our findings indicate that the consistent optimal number of clusters can be learnt from testing different fuzzifiers for each dataset and the fuzzifier with the lowest value for this consistency should be selected for clustering. Our evaluation also shows that the fuzzifier impacts the clustering accuracy. For longitudinal data with missing values, m = 2 could be an empirical rule to start fuzzy clustering, and the best clustering accuracy was achieved for tested data, especially using our multiple-imputation based fuzzy clustering.展开更多
This paper presents a fuzzy C- means clustering image segmentation algorithm based on particle swarm optimization, the method utilizes the strong search ability of particle swarm clustering search center. Because the ...This paper presents a fuzzy C- means clustering image segmentation algorithm based on particle swarm optimization, the method utilizes the strong search ability of particle swarm clustering search center. Because the search clustering center has small amount of calculation according to density, so it can greatly improve the calculation speed of fuzzy C- means algorithm. The experimental results show that, this method can make the fuzzy clustering to obviously improve the speed, so it can achieve fast image segmentation.展开更多
Sequence analysis technology under big data provides unprecedented opportunities for modern life science. A novel gene coding sequence identification method is proposed in this paper. Firstly, an improved short-time F...Sequence analysis technology under big data provides unprecedented opportunities for modern life science. A novel gene coding sequence identification method is proposed in this paper. Firstly, an improved short-time Fourier transform algorithm based on Morlet wavelet is applied to extract the power spectrum of DNA sequence. Then, threshold value determination method based on kernel fuzzy C-mean clustering is used to combine Signal to Noise Ratio (SNR) data of exon and intron into a sequence, classify the sequence into two types, calculate the weighted sum of two SNR clustering centers obtained and the discrimination threshold value. Finally, exon interval endpoint identification algorithm based on Takagi-Sugeno fuzzy identification model is presented to train Takagi-Sugeno model, optimize model parameters with Levenberg-Marquardt least square method, complete model and determine fuzzy rule. To verify the effectiveness of the proposed method, example tests are conducted on typical gene sequence sample data.展开更多
The premise and basis of load modeling are substation load composition inquiries and cluster analyses.However,the traditional kernel fuzzy C-means(KFCM)algorithm is limited by artificial clustering number selection an...The premise and basis of load modeling are substation load composition inquiries and cluster analyses.However,the traditional kernel fuzzy C-means(KFCM)algorithm is limited by artificial clustering number selection and its convergence to local optimal solutions.To overcome these limitations,an improved KFCM algorithm with adaptive optimal clustering number selection is proposed in this paper.This algorithm optimizes the KFCM algorithm by combining the powerful global search ability of genetic algorithm and the robust local search ability of simulated annealing algorithm.The improved KFCM algorithm adaptively determines the ideal number of clusters using the clustering evaluation index ratio.Compared with the traditional KFCM algorithm,the enhanced KFCM algorithm has robust clustering and comprehensive abilities,enabling the efficient convergence to the global optimal solution.展开更多
基金This research is funded by Graduate University of Science and Technology under grant number GUST.STS.DT2020-TT01。
文摘Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.
基金The National Natural Science Foundation of China(No.62262011)the Natural Science Foundation of Guangxi(No.2021JJA170130).
文摘Fuzzy C-Means(FCM)is an effective and widely used clustering algorithm,but there are still some problems.considering the number of clusters must be determined manually,the local optimal solutions is easily influenced by the random selection of initial cluster centers,and the performance of Euclid distance in complex high-dimensional data is poor.To solve the above problems,the improved FCM clustering algorithm based on density Canopy and Manifold learning(DM-FCM)is proposed.First,a density Canopy algorithm based on improved local density is proposed to automatically deter-mine the number of clusters and initial cluster centers,which improves the self-adaptability and stability of the algorithm.Then,considering that high-dimensional data often present a nonlinear structure,the manifold learning method is applied to construct a manifold spatial structure,which preserves the global geometric properties of complex high-dimensional data and improves the clustering effect of the algorithm on complex high-dimensional datasets.Fowlkes-Mallows Index(FMI),the weighted average of homogeneity and completeness(V-measure),Adjusted Mutual Information(AMI),and Adjusted Rand Index(ARI)are used as performance measures of clustering algorithms.The experimental results show that the manifold learning method is the superior distance measure,and the algorithm improves the clustering accuracy and performs superiorly in the clustering of low-dimensional and complex high-dimensional data.
基金The National Natural Science Foundation of China(No60672056)Open Fund of MOE-MS Key Laboratory of Multime-dia Computing and Communication(No06120809)
文摘To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of final clusters, are presented using WordNet origin from key phrases. Initial centers and membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built with the topic concept phrases representing topics of the texts and the initialization of centers and the membership matrix depend on the concept vectors in sub-spaces. The results show that, different from random initialization of traditional fuzzy c-means clustering, the initialization related to text content contributions can improve clustering precision.
文摘A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive to initializations and often generates coincident clusters. AFCM overcomes this shortcoming and it is an ex tension of PCM. Membership and typicality values can be simultaneously produced in AFCM. Experimental re- suits show that noise data can be well processed, coincident clusters are avoided and clustering accuracy is better.
基金Supported by National Key R&D Projects(Grant No.2018YFB0905500)National Natural Science Foundation of China(Grant No.51875498)+1 种基金Hebei Provincial Natural Science Foundation of China(Grant Nos.E2018203439,E2018203339,F2016203496)Key Scientific Research Projects Plan of Henan Higher Education Institutions(Grant No.19B460001)
文摘Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and timely. First, FCM clustering is employed to classify the data into different clusters, which helps to estimate whether there is a fault and how many fault types there are. If fault signals exist, the fault vibration signals are then demodulated and decomposed into different frequency bands by MMEMD in order to be analyzed further. In order to overcome the mode mixing defect of empirical mode decomposition (EMD), a novel method called MMEMD is proposed. It is an improvement to masking empirical mode decomposition (MEMD). By adding multi-masking signals to the signals to be decomposed in different levels, it can restrain low-frequency components from mixing in highfrequency components effectively in the sifting process and then suppress the mode mixing. It has the advantages of easy implementation and strong ability of suppressing modal mixing. The fault type is determined by Hilbert envelope finally. The results of simulation signal decomposition showed the high performance of MMEMD. Experiments of bearing fault diagnosis in wind turbine bearing fault diagnosis proved the validity and high accuracy of the new method.
基金Project(06JJ50110) supported by the Natural Science Foundation of Hunan Province, China
文摘To solve the problem of poor anti-noise performance of the traditional fuzzy C-means (FCM) algorithm in image segmentation, a novel two-dimensional FCM clustering algorithm for image segmentation was proposed. In this method, the image segmentation was converted into an optimization problem. The fitness function containing neighbor information was set up based on the gray information and the neighbor relations between the pixels described by the improved two-dimensional histogram. By making use of the global searching ability of the predator-prey particle swarm optimization, the optimal cluster center could be obtained by iterative optimization, and the image segmentation could be accomplished. The simulation results show that the segmentation accuracy ratio of the proposed method is above 99%. The proposed algorithm has strong anti-noise capability, high clustering accuracy and good segment effect, indicating that it is an effective algorithm for image segmentation.
文摘Determining the relatively similar hydrological properties of the watersheds is very crucial in order to readily classify them for management practices such as flood and soil erosion control. This study aimed to identify homogeneous hydrological watersheds using remote sensing data in western Iran. To achieve this goal, remote sensing indices including SAVI, LAI, NDMI, NDVI and snow cover, were extracted from MODIS data over the period 2000 to 2015. Then, a fuzzy method was used to clustering the watersheds based on the extracted indices. A fuzzy c-mean(FCM) algorithm enabled to classify 38 watersheds in three homogeneous groups.The optimal number of clusters was determined through evaluation of partition coefficient, partition entropy function and trial and error. The results indicated three homogeneous regions identified by the fuzzy c-mean clustering and remote sensing product which are consistent with the variations of topography and climate of the study area. Inherently,the grouped watersheds have similar hydrological properties and are likely to need similar management considerations and measures.
文摘Suppressed fuzzy c-means (S-FCM) clustering algorithm with the intention of combining the higher speed of hard c-means clustering algorithm and the better classification performance of fuzzy c-means clustering algorithm had been studied by many researchers and applied in many fields. In the algorithm, how to select the suppressed rate is a key step. In this paper, we give a method to select the fixed suppressed rate by the structure of the data itself. The experimental results show that the proposed method is a suitable way to select the suppressed rate in suppressed fuzzy c-means clustering algorithm.
文摘Diabetic Retinopathy(DR)is a vision disease due to the long-term prevalenceof Diabetes Mellitus.It affects the retina of the eye and causes severedamage to the vision.If not treated on time it may lead to permanent vision lossin diabetic patients.Today’s development in science has no medication to cureDiabetic Retinopathy.However,if diagnosed at an early stage it can be controlledand permanent vision loss can be avoided.Compared to the diabetic population,experts to diagnose Diabetic Retinopathy are very less in particular to local areas.Hence an automatic computer-aided diagnosis for DR detection is necessary.Inthis paper,we propose an unsupervised clustering technique to automatically clusterthe DR into one of its five development stages.The deep learning based unsupervisedclustering is made to improve itself with the help of fuzzy rough c-meansclustering where cluster centers are updated by fuzzy rough c-means clusteringalgorithm during the forward pass and the deep learning model representationsare updated by Stochastic Gradient Descent during the backward pass of training.The proposed method was implemented using python and the results were takenon DGX server with Tesla V100 GPU cards.An experimental result on the publicallyavailable Kaggle dataset shows an overall accuracy of 88.7%.The proposedmodel improves the accuracy of DR diagnosis compared to the existingunsupervised algorithms like k-means,FCM,auto-encoder,and FRCM withalexnet.
基金Supported by the National Natural Science Foundation of China(61139002)~~
文摘Partition-based clustering with weighted feature is developed in the framework of shadowed sets. The objects in the core and boundary regions, generated by shadowed sets-based clustering, have different impact on the prototype of each cluster. By integrating feature weights, a formula for weight calculation is introduced to the clustering algorithm. The selection of weight exponent is crucial for good result and the weights are updated iteratively with each partition of clusters. The convergence of the weighted algorithms is given, and the feasible cluster validity indices of data mining application are utilized. Experimental results on both synthetic and real-life numerical data with different feature weights demonstrate that the weighted algorithm is better than the other unweighted algorithms.
文摘Minimally Invasive Spine surgery (MISS) was developed to treat disorders of the spine with less disruption to the muscles. Surgeons use CT images to monitor the volume of muscles after operation in order to evaluate the progress of patient recovery. The first step in the task is to segment the muscle regions from other tissues/organs in CT images. However, manual segmentation of muscle regions is not only inaccurate, but also time consuming. In this work, Gray Space Map (GSM) is used in fuzzy c-means clustering algorithm to segment muscle regions in CT images. GSM com- bines both spatial and intensity information of pixels. Experiments show that the proposed GSM- based fuzzy c-means clustering muscle CT image segmentation yields very good results.
文摘A novel example-based process for Automated Colorization of grayscale images using Texture Descriptors (ACTD) without any human intervention is proposed. By analyzing a set of sample color images, coherent regions of homogeneous textures are extracted. A multi-channel filtering technique is used for texture-based image segmentation, combined with a modified Fuzzy C-means (FCM) clustering algorithm. This modified FCM clustering algorithm includes both the local spatial information from neighboring pixels, and the spatial Euclidian distance to the cluster’s center of gravity. For each area of interest, state-of-the-art texture descriptors are then computed and stored, along with corresponding color information. These texture descriptors and the color information are used for colorization of a grayscale image with similar textures. Given a grayscale image to be colorized, the segmentation and feature extraction processes are repeated. The texture descriptors are used to perform Content-Based Image Retrieval (CBIR). The colorization process is performed by Chroma replacement. This research finds numerous applications, ranging from classic film restoration and enhancement, to adding valuable information into medical and satellite imaging. Also, this can be used to enhance the detection of objects from x-ray images at the airports.
文摘Classifying the data into a meaningful group is one of the fundamental ways of understanding and learning the valuable information. High-quality clustering methods are necessary for the valuable and efficient analysis of the increasing data. The Firefly Algorithm (FA) is one of the bio-inspired algorithms and it is recently used to solve the clustering problems. In this paper, Hybrid F-Firefly algorithm is developed by combining the Fuzzy C-Means (FCM) with FA to improve the clustering accuracy with global optimum solution. The Hybrid F-Firefly algorithm is developed by incorporating FCM operator at the end of each iteration in FA algorithm. This proposed algorithm is designed to utilize the goodness of existing algorithm and to enhance the original FA algorithm by solving the shortcomings in the FCM algorithm like the trapping in local optima and sensitive to initial seed points. In this research work, the Hybrid F-Firefly algorithm is implemented and experimentally tested for various performance measures under six different benchmark datasets. From the experimental results, it is observed that the Hybrid F-Firefly algorithm significantly improves the intra-cluster distance when compared with the existing algorithms like K-means, FCM and FA algorithm.
基金supported by the National Key Research and Development Program of China(No.2022YFB3304400)the National Natural Science Foundation of China(Nos.6230311,62303111,62076060,61932007,and 62176083)the Key Research and Development Program of Jiangsu Province of China(No.BE2022157).
文摘Traditional Fuzzy C-Means(FCM)and Possibilistic C-Means(PCM)clustering algorithms are data-driven,and their objective function minimization process is based on the available numeric data.Recently,knowledge hints have been introduced to formknowledge-driven clustering algorithms,which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints.However,these algorithms cannot produce the optimal number of clusters by the clustering algorithm itself;they require the assistance of evaluation indices.Moreover,knowledge hints are usually used as part of the data structure(directly replacing some clustering centers),which severely limits the flexibility of the algorithm and can lead to knowledgemisguidance.To solve this problem,this study designs a newknowledge-driven clustering algorithmcalled the PCM clusteringwith High-density Points(HP-PCM),in which domain knowledge is represented in the form of so-called high-density points.First,a newdatadensitycalculation function is proposed.The Density Knowledge Points Extraction(DKPE)method is established to filter out high-density points from the dataset to form knowledge hints.Then,these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure.Finally,the initial number of clusters is set to be greater than the true one based on the number of knowledge hints.Then,the HP-PCM algorithm automatically determines the final number of clusters during the clustering process by considering the cluster elimination mechanism.Through experimental studies,including some comparative analyses,the results highlight the effectiveness of the proposed algorithm,such as the increased success rate in clustering,the ability to determine the optimal cluster number,and the faster convergence speed.
基金supported by the Science and Technology Project of State Grid Jiangxi Electric Power Corporation Limited‘Research on Key Technologies for Non-Intrusive Load Identification for Typical Power Industry Users in Jiangxi Province’(521852220004)。
文摘Studying user electricity consumption behavior is crucial for understanding their power usage patterns.However,the traditional clustering methods fail to identify emerging types of electricity consumption behavior.To address this issue,this paper introduces a statistical analysis of clusters and evaluates the set of indicators for power usage patterns.The fuzzy C-means clustering algorithm is then used to analyze 6 months of electricity consumption data in 2017 from energy storage equipment,agricultural drainage irrigation,port shore power,and electric vehicles.Finally,the proposed method is validated through experiments,where the Davies-Bouldin index and profile coefficient are calculated and compared.Experiments showed that the optimal number of clusters is 4.This study demonstrates the potential of using a fuzzy C-means clustering algorithmin identifying emerging types of electricity consumption behavior,which can help power system operators and policymakers to make informed decisions and improve energy efficiency.
基金Work supported by the Second Stage of Brain Korea 21 ProjectsWork(2010-0020163) supported by Priority Research Centers Program through the National Research Foundation (NRF) funded by the Ministry of Education,Science and Technology of Korea
文摘An advanced fuzzy C-mean (FCM) algorithm was proposed for the efficient regional clustering of multi-nodes interconnected systems. Due to various locational prices and regional coherencies for each node and point, modified similarity measure was considered to gather nodes having similar characteristics. The similarity measure was needed to contain locafi0nal prices as well as regional coherency. In order to consider the two properties simultaneously, distance measure of fuzzy C-mean algorithm had to be modified. Regional clustering algorithm for interconnected power systems was designed based on the modified fuzzy C-mean algorithm. The proposed algorithm produces proper classification for the interconnected power system and the results are demonstrated in the example of IEEE 39-bus interconnected electricity system.
文摘Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achieving higher clustering accuracy, which few studies have investigated. Built upon two existing methods on selecting fuzzifier, we developed an integrated fuzzifier evaluation and selection algorithm and tested it using real datasets. Our findings indicate that the consistent optimal number of clusters can be learnt from testing different fuzzifiers for each dataset and the fuzzifier with the lowest value for this consistency should be selected for clustering. Our evaluation also shows that the fuzzifier impacts the clustering accuracy. For longitudinal data with missing values, m = 2 could be an empirical rule to start fuzzy clustering, and the best clustering accuracy was achieved for tested data, especially using our multiple-imputation based fuzzy clustering.
文摘This paper presents a fuzzy C- means clustering image segmentation algorithm based on particle swarm optimization, the method utilizes the strong search ability of particle swarm clustering search center. Because the search clustering center has small amount of calculation according to density, so it can greatly improve the calculation speed of fuzzy C- means algorithm. The experimental results show that, this method can make the fuzzy clustering to obviously improve the speed, so it can achieve fast image segmentation.
文摘Sequence analysis technology under big data provides unprecedented opportunities for modern life science. A novel gene coding sequence identification method is proposed in this paper. Firstly, an improved short-time Fourier transform algorithm based on Morlet wavelet is applied to extract the power spectrum of DNA sequence. Then, threshold value determination method based on kernel fuzzy C-mean clustering is used to combine Signal to Noise Ratio (SNR) data of exon and intron into a sequence, classify the sequence into two types, calculate the weighted sum of two SNR clustering centers obtained and the discrimination threshold value. Finally, exon interval endpoint identification algorithm based on Takagi-Sugeno fuzzy identification model is presented to train Takagi-Sugeno model, optimize model parameters with Levenberg-Marquardt least square method, complete model and determine fuzzy rule. To verify the effectiveness of the proposed method, example tests are conducted on typical gene sequence sample data.
基金supported by the Planning Special Project of Guangdong Power Grid Co.,Ltd.:“Study on load modeling based on total measurement and discrimination method suitable for system characteristic analysis and calculation during the implementation of target grid in Guangdong power grid”(0319002022030203JF00023).
文摘The premise and basis of load modeling are substation load composition inquiries and cluster analyses.However,the traditional kernel fuzzy C-means(KFCM)algorithm is limited by artificial clustering number selection and its convergence to local optimal solutions.To overcome these limitations,an improved KFCM algorithm with adaptive optimal clustering number selection is proposed in this paper.This algorithm optimizes the KFCM algorithm by combining the powerful global search ability of genetic algorithm and the robust local search ability of simulated annealing algorithm.The improved KFCM algorithm adaptively determines the ideal number of clusters using the clustering evaluation index ratio.Compared with the traditional KFCM algorithm,the enhanced KFCM algorithm has robust clustering and comprehensive abilities,enabling the efficient convergence to the global optimal solution.