Funding: Supported by the National Natural Science Foundation of China under Grant 51777193.
Abstract: An improved fuzzy time series algorithm based on clustering is designed in this paper and applied to short-term load forecasting in distribution stations. First, the K-means clustering method is used to cluster the data, and the midpoint of two adjacent cluster centers is taken as the dividing point of the domain partition. On this basis, the data are fuzzified to form a fuzzy time series. Second, a high-order fuzzy relation with multiple antecedents is established according to the main measurement indexes of power load and is used to predict the short-term trend of load in distribution stations. Matlab/Simulink simulation results show that the load forecasting errors of the typical fuzzy time series on the time scales of one day and one week are [−50, 20] and [−50, 30], while those of the improved fuzzy time series are [−20, 15] and [−20, 25]. This shows that the fuzzy time series algorithm improved by clustering improves prediction accuracy and can effectively predict the short-term load trend of distribution stations.
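The partitioning step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantile-based initialization, and the toy load values are assumptions made for illustration, and each value is mapped to a single crisp interval index rather than to graded memberships.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd-style K-means on scalar data; returns the sorted cluster centers.

    Initial centers are taken at evenly spaced positions of the sorted data
    (an assumption chosen here so that the result is deterministic).
    """
    ordered = sorted(values)
    centers = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the previous center if a cluster receives no points.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def partition_points(centers):
    """Midpoints of adjacent cluster centers, used as interval boundaries."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def fuzzify(value, boundaries):
    """Index of the interval (fuzzy set A_0, A_1, ...) that contains value."""
    return sum(value > b for b in boundaries)


# Hypothetical load readings, not data from the paper.
loads = [102, 98, 105, 151, 149, 155, 201, 198, 205, 100, 150, 200]
centers = kmeans_1d(loads, k=3)
boundaries = partition_points(centers)
series = [fuzzify(v, boundaries) for v in loads]
print(centers, boundaries, series)
```

The resulting `series` is the sequence of interval labels on which fuzzy relations would then be built; the high-order, multi-antecedent relations mentioned in the abstract are not covered by this sketch.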
Funding: Supported by the National Natural Science Foundation of China (61273209).
Abstract: Because of limited knowledge and hesitation, the membership degree of an element to a given set often takes several different values, a situation that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of the algorithm.
Abstract: Diabetic Retinopathy (DR) is a vision disease caused by the long-term prevalence of Diabetes Mellitus. It affects the retina of the eye and causes severe damage to vision. If not treated in time, it may lead to permanent vision loss in diabetic patients. Current medical science has no medication to cure Diabetic Retinopathy. However, if it is diagnosed at an early stage, it can be controlled and permanent vision loss can be avoided. Compared to the size of the diabetic population, experts who can diagnose Diabetic Retinopathy are very few, particularly in local areas. Hence an automatic computer-aided diagnosis for DR detection is necessary. In this paper, we propose an unsupervised clustering technique to automatically cluster DR into one of its five development stages. The deep learning based unsupervised clustering improves itself with the help of fuzzy rough c-means clustering: cluster centers are updated by the fuzzy rough c-means clustering algorithm during the forward pass, and the deep learning model representations are updated by Stochastic Gradient Descent during the backward pass of training. The proposed method was implemented in Python, and the results were obtained on a DGX server with Tesla V100 GPU cards. An experiment on the publicly available Kaggle dataset shows an overall accuracy of 88.7%. The proposed model improves the accuracy of DR diagnosis compared to existing unsupervised algorithms such as k-means, FCM, auto-encoder, and FRCM with AlexNet.
Abstract: Clustering is an unsupervised learning method used to organize raw data so that items with the same (similar) characteristics fall in the same class and dissimilar items fall in different classes. The very rapid increase in the amount of data being produced brings new challenges in the analysis and storage of this data. Recently, there has been growing interest in key areas such as real-time data mining, which reveals an urgent need to process very large data under strict performance constraints. The objective of this paper is to survey four algorithms used for data clustering, namely the K-Means algorithm, the FCM algorithm, the EM algorithm and BIRCH, and to show their strengths and weaknesses. Another task is to compare the results obtained by applying each of these algorithms to the same data and to draw a conclusion based on these results.
Abstract: To address the problem of judging the optimal dividing matrix among several fuzzy dividing matrices in a fuzzy dividing space, which is determined by the choice of cluster samples from the total sample space, two algorithms are proposed on the basis of the data analysis method in rough set theory: an information system discretization algorithm (algorithm 1) and a sample representativeness judging algorithm (algorithm 2). Following the farthest-distance principle, algorithm 1 transforms continuous data into a discrete form that can be processed by rough set theory. Taking the approximation precision as a criterion, algorithm 2 chooses a sample space that is well representative. The clustering sample set for inducing and computing the optimal dividing matrix can thus be obtained. Several theorems are proposed to provide a strict theoretical foundation for the execution of the algorithm model. An applied example based on the new algorithm model is given, and its result verifies the feasibility of the model.
Abstract: The demand for individualized teaching from E-learning websites is rapidly increasing because of the large differences among Web learners. A method for clustering Web learners based on rough sets is proposed. The basic idea of the method is to reduce the learning attributes prior to clustering, so that the clustering of Web learners is carried out in a relatively low-dimensional space. Using this method, E-learning websites can arrange corresponding teaching content for different clusters of learners so that the learners’ individual requirements are better satisfied.
Key words: rough set; attribute reduction; k-means clustering; individualized teaching
CLC number: TP 391.6
Foundation item: Supported by the National “863” Program of China (2002AA111010, 2003AA001032)
Biography: LIU Shuai-dong (1979-), male, Master candidate, research direction: knowledge discovery and individualized learning techniques.
Abstract: Classifying data into meaningful groups is one of the fundamental ways of understanding and learning valuable information. High-quality clustering methods are necessary for the effective and efficient analysis of ever-increasing data. The Firefly Algorithm (FA) is a bio-inspired algorithm that has recently been used to solve clustering problems. In this paper, the Hybrid F-Firefly algorithm is developed by combining Fuzzy C-Means (FCM) with FA to improve clustering accuracy with a global optimum solution. The Hybrid F-Firefly algorithm incorporates an FCM operator at the end of each iteration of the FA algorithm. The proposed algorithm is designed to exploit the strengths of the existing algorithms and to enhance the original FA algorithm while overcoming shortcomings of the FCM algorithm such as trapping in local optima and sensitivity to initial seed points. In this research work, the Hybrid F-Firefly algorithm is implemented and experimentally tested on various performance measures using six different benchmark datasets. The experimental results show that the Hybrid F-Firefly algorithm significantly improves the intra-cluster distance compared with existing algorithms such as K-means, FCM and FA.
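For readers unfamiliar with the FCM operator that this abstract refers to, the following is a minimal sketch of the standard fuzzy c-means updates only. The firefly search and the way the authors interleave the two are not reproduced; the function names, the fuzzifier `m = 2`, the fixed iteration count, and the toy points are assumptions for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0, eps=1e-9):
    """Membership of each point in each cluster (each row sums to 1)."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center is handled.
        dists = [max(math.dist(p, c), eps) for c in centers]
        rows.append([
            1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in dists)
            for di in dists
        ])
    return rows


def fcm_centers(points, memberships, m=2.0):
    """Centers as means of the points weighted by membership ** m."""
    dims = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)
        ))
    return centers


def fcm(points, centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        centers = fcm_centers(points, fcm_memberships(points, centers, m), m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fcm(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

Because these updates only descend from the given starting centers, the outcome depends on the seed points, which is the sensitivity the abstract says the hybrid with FA is meant to reduce.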
Funding: Supported by the National Key Research and Development Program of China (2018YFB1201500).
Abstract: This paper applies Gaussian interval type-2 fuzzy set theory to historical traffic volume data to obtain a 24-hour prediction of traffic volume with high precision. A K-means clustering method is used to obtain the 5-minute traffic volume variation as input data for the Gaussian interval type-2 fuzzy sets, which can reflect the distribution of historical traffic volume in one statistical period. Moreover, the cluster containing the most data points obtained by K-means clustering is used to calculate the key parameters of the type-2 fuzzy sets, namely the mean and standard deviation of the Gaussian membership function. Using the range of data as the input of the Gaussian interval type-2 fuzzy sets yields a forecast range of traffic volume, which describes the possible range of the traffic volume, together with traffic volume prediction data of high accuracy. The simulation results show that the average relative error is reduced to 8% with the combined K-means Gaussian interval type-2 fuzzy sets forecasting method. The fluctuation range, given by an upper and a lower forecast traffic volume, completely envelops the actual traffic volume and reproduces the fluctuation range of traffic flow.
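The parameter-estimation step mentioned in this abstract (mean and standard deviation taken from the largest cluster) can be sketched as follows. The abstract does not say how the interval is formed; the sketch assumes a common construction with a fixed mean and an uncertain standard deviation, and the `spread` parameter, the function names, the cluster labels, and the sample volumes are all hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def largest_cluster(values, labels):
    """Values that belong to the most populated cluster."""
    top_label, _ = Counter(labels).most_common(1)[0]
    return [v for v, lab in zip(values, labels) if lab == top_label]


def gaussian(x, mu, sigma):
    """Gaussian membership function with center mu and width sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def interval_type2_membership(x, mu, sigma, spread=0.2):
    """(lower, upper) membership for a Gaussian with uncertain width.

    Assumption: the width lies in [sigma * (1 - spread), sigma * (1 + spread)].
    With a shared mean, the narrower Gaussian is the lower bound everywhere.
    """
    lower = gaussian(x, mu, sigma * (1.0 - spread))
    upper = gaussian(x, mu, sigma * (1.0 + spread))
    return lower, upper


# Hypothetical 5-minute volumes with cluster labels assumed to come from K-means.
volumes = [40, 42, 44, 46, 48, 90, 95]
labels = [0, 0, 0, 0, 0, 1, 1]
cluster = largest_cluster(volumes, labels)
mu, sigma = mean(cluster), pstdev(cluster)
lower, upper = interval_type2_membership(45, mu, sigma)
print(mu, sigma, lower, upper)
```

The gap between `lower` and `upper` is what gives an interval type-2 set its band of membership values; how that band is turned into the upper and lower traffic forecasts is specific to the paper and is not shown here.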
Abstract: Feature selection is very important for obtaining meaningful and interpretable results from a clustering analysis. In soil data clustering, the response of clustering performance to different feature subsets is not well understood. In the present paper, we analyzed the performance differences between k-means, fuzzy c-means, and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the spectral clustering algorithm generally performed better than k-means and fuzzy c-means across the feature subsets. Feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.
Funding: The Fujian Provincial Natural Science Foundation of China (Z0510492006J0391).
Abstract: Based on the rough similarity degree of rough sets and the close degree of fuzzy sets, definitions of the rough similarity degree and the rough close degree of rough fuzzy sets are given, which can be used to measure the degree of similarity between two rough fuzzy sets. Their properties and theorems are listed. Using the two new measures, a method of clustering in the rough fuzzy system is obtained. After clustering, a new fuzzy sample can be recognized by the principle of maximal similarity degree.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 30525030, 60701015, and 60736029.
Abstract: In recent years, microarray technology has been widely applied in biological and clinical studies for simultaneous monitoring of the expression of thousands of genes. Gene clustering analysis is useful for discovering groups of correlated genes that are potentially co-regulated or associated with the disease or conditions under investigation. Many clustering methods, including k-means, fuzzy c-means, and hierarchical clustering, have been widely used in the literature. Yet no comprehensive comparative study has been performed to evaluate the effectiveness of these methods, especially for the yeast Saccharomyces cerevisiae. In this paper, these three gene clustering methods are compared. Classification accuracy and CPU time cost are used to measure the performance of the algorithms. Our results show that hierarchical clustering outperforms k-means and fuzzy c-means clustering. The analysis provides insight into the complicated problem of clustering gene expression profiles and serves as a practical guideline for routine microarray cluster analysis of gene expression.
Abstract: Data analysis and automatic processing is often interpreted as knowledge acquisition. In many cases it is necessary to classify data or find regularities in them. Results obtained in the search for regularities in intelligent data analysis applications are mostly represented with the help of IF-THEN rules. With the help of these rules the following tasks are solved: prediction, classification, pattern recognition and others. Using different approaches (clustering algorithms, neural network methods, fuzzy rule processing methods), we can extract rules that characterize the data in an understandable language. This allows interpreting the data, finding relationships in the data and extracting new rules that characterize them. Knowledge acquisition in this paper is defined as the process of extracting knowledge from numerical data in the form of rules. Extraction of rules in this context is based on the clustering methods K-means and fuzzy C-means. With the assistance of the K-means clustering algorithm, rules are derived from trained neural networks. Fuzzy C-means is used in a fuzzy rule based design method. The rule extraction methodology is demonstrated on samples of Fisher's Iris flower data set. The effectiveness of the extracted rules is evaluated. The clustering and rule extraction methodology can be widely used in evaluating and analyzing various economic and financial processes.
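Abstract: To address the difficulty of extracting features from high-dimensional time-series user load data in distribution networks, of clustering overlapping data, and of labeling load data accurately, this paper proposes a feature extraction and label definition model for user load data based on a denoising autoencoder and improved rough fuzzy K-means (feature extraction and label definition model based on DAE and improve RFKM, FLMbD-iR). FLMbD-iR first uses the denoising autoencoder to extract deep features from the raw user load data, then clusters them with a rough fuzzy K-means based on a measure of cluster-size imbalance, which addresses the errors that arise for data overlapping between clusters, and finally constructs descriptive indicators to define labels for typical daily load curves. The experiments are simulated on US electric load data, and the results show that the method is markedly effective for clustering user load data.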
Funding: Supported by the National Natural Science Foundation of China (61203099, 61034008, 61225016); Beijing Science and Technology Project (Z141100001414005); Beijing Science and Technology Special Project (Z141101004414058); Ph.D. Program Foundation from Ministry of Chinese Education (20121103120020); Beijing Nova Program (Z131104000413007); Hong Kong Scholar Program (XJ2013018).
Abstract: It is difficult to measure biochemical oxygen demand (BOD) online because of the nonlinear dynamics, large lag and uncertainty of the wastewater treatment process. In this paper, an improved T–S fuzzy neural network (TSFNN), chosen for its knowledge representation ability and learning capability, is introduced to predict BOD values by soft computing. In this improved TSFNN, K-means clustering is used to initialize the structure of the TSFNN, including the number of fuzzy rules and the parameters of the membership functions. For training the TSFNN, a gradient descent method with a momentum term is used to adjust the antecedent and consequent parameters. The improved TSFNN is applied to predict the BOD values in the effluent of the wastewater treatment process. The simulation results show that the TSFNN with the K-means clustering algorithm can measure BOD values accurately and achieves better approximation performance than some other methods.
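The training rule named in this abstract, gradient descent with a momentum term, can be illustrated in isolation. The sketch below applies the generic momentum update to a single scalar parameter with a hypothetical quadratic loss; it is not the TSFNN itself, and the function name, learning rate, and momentum coefficient are assumptions.

```python
def momentum_descent(grad, w0, lr=0.1, beta=0.9, steps=200):
    """Minimize a scalar function given its gradient, using a momentum term.

    Each step keeps a fraction `beta` of the previous update and adds the
    new gradient step on top of it.
    """
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * grad(w)
        w = w + velocity
    return w


# Hypothetical loss (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = momentum_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_star)
```

In a network such as the one described, the same update would be applied to every antecedent and consequent parameter, with `grad` replaced by the gradient of the prediction error.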
Funding: This work has been partially supported by the grant received in the research project under RUSA 2.0 component 8, Govt. of India, New Delhi.
Abstract: Partitional clustering techniques such as K-Means (KM), Fuzzy C-Means (FCM), and Rough K-Means (RKM) are simple and effective techniques for image segmentation. However, because their initial cluster centers are randomly determined, certain clusters often converge to local optima. In addition, pathology image segmentation is problematic because of uneven lighting, stain, and camera settings during the microscopic image capturing process. Therefore, this study proposes an Improved Slime Mould Algorithm (ISMA) based on opposition-based learning and differential evolution’s mutation strategy to perform illumination-free White Blood Cell (WBC) segmentation. The ISMA helps to overcome, to some extent, the local optima trapping problem of the partitional clustering techniques. This paper also performs an in-depth analysis that considers only the color components of many well-known color spaces for clustering, in order to find the effect of illumination on color pathology image clustering. Numerical and visual results encourage the use of illumination-free or color component-based clustering approaches for image segmentation. ISMA-KM with the “ab” color channels of the CIELab color space provides the best results, with above 99% accuracy, for nucleus-only segmentation, whereas for entire WBC segmentation, ISMA-KM with the “CbCr” color components of the YCbCr color space provides the best results, with an accuracy above 99%. Furthermore, ISMA-KM and ISMA-RKM have the lowest and highest execution times, respectively. In addition, ISMA provides competitive outcomes on the CEC2019 benchmark test functions compared to recent well-established and efficient Nature-Inspired Optimization Algorithms (NIOAs).