Abstract: Diabetic Retinopathy (DR) is a vision disease caused by the long-term prevalence of Diabetes Mellitus. It affects the retina of the eye and causes severe damage to vision. If not treated in time, it may lead to permanent vision loss in diabetic patients. No medication currently available can cure Diabetic Retinopathy. However, if diagnosed at an early stage, it can be controlled and permanent vision loss can be avoided. Compared with the size of the diabetic population, experts able to diagnose Diabetic Retinopathy are scarce, particularly in local areas. Hence, automatic computer-aided diagnosis for DR detection is necessary. In this paper, we propose an unsupervised clustering technique to automatically cluster DR into one of its five development stages. The deep learning based unsupervised clustering improves itself with the help of fuzzy rough c-means (FRCM) clustering: cluster centers are updated by the fuzzy rough c-means clustering algorithm during the forward pass, and the deep learning model representations are updated by Stochastic Gradient Descent during the backward pass of training. The proposed method was implemented in Python, and the results were obtained on a DGX server with Tesla V100 GPU cards. Experimental results on the publicly available Kaggle dataset show an overall accuracy of 88.7%. The proposed model improves the accuracy of DR diagnosis compared with existing unsupervised algorithms such as k-means, FCM, auto-encoder, and FRCM with AlexNet.
Funding: The National Natural Science Foundation of China (No. 62262011); the Natural Science Foundation of Guangxi (No. 2021JJA170130).
Abstract: Fuzzy C-Means (FCM) is an effective and widely used clustering algorithm, but it still has some problems: the number of clusters must be determined manually, the locally optimal solution is easily influenced by the random selection of initial cluster centers, and Euclidean distance performs poorly on complex high-dimensional data. To solve these problems, an improved FCM clustering algorithm based on density Canopy and manifold learning (DM-FCM) is proposed. First, a density Canopy algorithm based on improved local density is proposed to automatically determine the number of clusters and the initial cluster centers, which improves the self-adaptability and stability of the algorithm. Then, considering that high-dimensional data often present a nonlinear structure, the manifold learning method is applied to construct a manifold spatial structure, which preserves the global geometric properties of complex high-dimensional data and improves the clustering effect of the algorithm on complex high-dimensional datasets. The Fowlkes-Mallows Index (FMI), the weighted average of homogeneity and completeness (V-measure), Adjusted Mutual Information (AMI), and the Adjusted Rand Index (ARI) are used as performance measures of the clustering algorithms. The experimental results show that the manifold learning method yields the superior distance measure, and that the algorithm improves clustering accuracy and performs well on both low-dimensional and complex high-dimensional data.
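For orientation, the baseline that DM-FCM modifies can be sketched as follows. This is a minimal sketch of standard FCM with Euclidean distance, not the DM-FCM method itself: the density Canopy initialization and the manifold distance are not reproduced, and the function name `fcm`, its parameters, and the requirement that the caller supplies the initial centers are assumptions made for illustration.

```python
import math


def fcm(points, init_centers, m=2.0, iters=100, tol=1e-6):
    """Baseline fuzzy c-means (Euclidean distance, caller-supplied initial centers)."""
    centers = [list(v) for v in init_centers]
    c, n, dim = len(centers), len(points), len(points[0])
    u = [[0.0] * n for _ in range(c)]  # u[i][k]: membership of point k in cluster i
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1))
        for k, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            if min(d) == 0.0:
                # The point coincides with a center: full membership there.
                hit = d.index(0.0)
                for i in range(c):
                    u[i][k] = 1.0 if i == hit else 0.0
            else:
                for i in range(c):
                    u[i][k] = 1.0 / sum((d[i] / dj) ** (2.0 / (m - 1.0)) for dj in d)
        # Center update: mean of the points weighted by u_ik^m
        new_centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[k] * points[k][a] for k in range(n)) / total for a in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


blobs = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, u = fcm(blobs, init_centers=[(0, 0), (11, 11)])
```

Both the number of clusters and the starting centers are inputs here, which is exactly the manual step the abstract says the density Canopy stage is meant to remove.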
Funding: This research was partly supported by the National Science and Technology Council, Taiwan, under Grant Numbers 112-2221-E-992-045, 112-2221-E-992-057-MY3, and 112-2622-8-992-009-TD1.
Abstract: Since its inception, the Internet has been evolving rapidly. With the advancement of science and technology and the explosive growth of the population, demand for the Internet has been rising. Applications in education, healthcare, entertainment, science, and other fields are increasingly deployed on the Internet. Concurrently, malicious threats on the Internet are rising as well. Distributed Denial of Service (DDoS) attacks are among the most common and dangerous of these threats, and their scale and complexity are constantly growing. Intrusion Detection Systems (IDS) have been deployed and have demonstrated their effectiveness in defending against those threats. In addition, research on Machine Learning (ML) and Deep Learning (DL) in IDS has produced effective results and gained significant attention. However, one of the challenges when applying ML and DL techniques to intrusion detection is the identification of unknown attacks. These attacks, which are not encountered during the system's training, can lead to misclassification with significant errors. In this research, we focus on Unknown Attack Detection by combining two methods: Spatial Location Constraint Prototype Loss (SLCPL) and Fuzzy C-Means (FCM). The proposed method achieves promising results compared with traditional methods. It reaches an accuracy of up to 99.8% with a low false positive rate for known attacks on the Intrusion Detection Evaluation Dataset (CICIDS2017). For unknown DDoS attacks on the DDoS Evaluation Dataset (CICDDoS2019), the accuracy reaches 99.7% and the precision goes up to 99.9%. The success of the proposed method is due to the combination of SLCPL, an advanced Open-Set Recognition (OSR) technique, and FCM, a traditional yet highly applicable clustering technique. This combination yields a novel method in the field of unknown attack detection and further expands the trend of applying DL and ML techniques in the development of intrusion detection systems and cybersecurity. Finally, implementing the proposed method in real-world systems can enhance security capabilities against increasingly complex threats on computer networks.
Abstract: Brain tumors are a major cause of increased mortality among children and adults. This article proposes an improved method based on magnetic resonance imaging (MRI) of the brain and image segmentation. Automated classification is motivated by the need for high accuracy when a human life is at stake. Detection of brain tumors is a challenging problem due to the high diversity in tumor appearance and ambiguous tumor boundaries. MRI images are chosen for the detection of brain tumors because they are used in the examination of soft tissues. First, image preprocessing is used to improve image quality. Second, multi-scale decomposition by the dual-tree complex wavelet transform is used to analyze the texture of an image. Feature extraction then draws features from the image using the gray-level co-occurrence matrix (GLCM). Next, the neuro-fuzzy technique is used to classify brain tumor stages as benign, malignant, or normal based on texture characteristics. Finally, the tumor location is detected using the Otsu threshold. The performance of the classifier is evaluated on the basis of classification accuracy. The simulated results show that the proposed classifier provides better accuracy than the previous method.
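The final localization step relies on Otsu thresholding, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal sketch of that standard technique only; the flat list of 8-bit gray values, the function name `otsu_threshold`, and the toy data are assumptions for illustration, and the abstract does not say how the authors apply the threshold to their MRI slices.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form the lower class, the rest the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy "image": dark background around 10-12, bright region around 200-210.
toy_pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(toy_pixels)
bright_mask = [p > t for p in toy_pixels]
```

On real images the threshold would be computed per slice after preprocessing, and the bright mask would still need cleanup (for example, keeping the largest connected region) before it could be read as a tumor location.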
Funding: Supported by the National Natural Science Foundation of China (61139002).
Abstract: Partition-based clustering with weighted features is developed in the framework of shadowed sets. The objects in the core and boundary regions, generated by shadowed sets-based clustering, have different impacts on the prototype of each cluster. By integrating feature weights, a formula for weight calculation is introduced into the clustering algorithm. The selection of the weight exponent is crucial for good results, and the weights are updated iteratively with each partition of clusters. The convergence of the weighted algorithms is given, and feasible cluster validity indices from data mining applications are used. Experimental results on both synthetic and real-life numerical data with different feature weights demonstrate that the weighted algorithm performs better than the unweighted algorithms.
Funding: The National Natural Science Foundation of China (No. 60672056); Open Fund of MOE-MS Key Laboratory of Multimedia Computing and Communication (No. 06120809).
Abstract: To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of the final clusters, are derived from the key phrases using WordNet. Initial centers and the membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built from the topic concept phrases representing the topics of the texts, and the initialization of the centers and the membership matrix depends on the concept vectors in the sub-spaces. The results show that, unlike the random initialization of traditional fuzzy c-means clustering, initialization based on text content can improve clustering precision.
Abstract: A novel model of fuzzy clustering, the allied fuzzy c-means (AFCM) model, is proposed by combining the advantages of fuzzy c-means (FCM) and possibilistic c-means (PCM) clustering. PCM is sensitive to initialization and often generates coincident clusters. AFCM, an extension of PCM, overcomes this shortcoming. Membership and typicality values can be produced simultaneously in AFCM. Experimental results show that noisy data can be processed well, coincident clusters are avoided, and clustering accuracy is better.
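The difference between the two quantities AFCM produces can be seen on a single point. The sketch below uses the standard FCM membership and PCM typicality formulas only; the AFCM update rule itself is not given in the abstract and is not reproduced. The function names, the scale parameters `etas`, and the example distances are assumptions for illustration, and all distances are assumed to be nonzero.

```python
def fcm_memberships(dists, m=2.0):
    """FCM memberships of one point from its distances to every center.

    They are relative: the values always sum to 1 across clusters.
    """
    return [
        1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in dists)
        for di in dists
    ]


def pcm_typicalities(dists, etas, m=2.0):
    """PCM typicalities of one point.

    They are absolute: each value depends only on the distance to its own
    center and that cluster's scale parameter eta.
    """
    return [
        1.0 / (1.0 + (d * d / eta) ** (1.0 / (m - 1.0)))
        for d, eta in zip(dists, etas)
    ]


# An outlier equally far (distance 100) from two centers.
noise_u = fcm_memberships([100.0, 100.0])
noise_t = pcm_typicalities([100.0, 100.0], etas=[1.0, 1.0])

# An inlier close to the first center.
inlier_u = fcm_memberships([1.0, 9.0])
inlier_t = pcm_typicalities([1.0, 9.0], etas=[1.0, 1.0])
```

FCM is forced to give the outlier a membership of 0.5 in each cluster, whereas its PCM typicality is close to zero in both, which is why typicality helps with noise. Because typicalities are computed per cluster without any coupling, nothing in PCM stops two centers from settling on the same group of points, which matches the coincident-cluster problem the abstract mentions.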
Funding: The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2006BAH02A06).
Abstract: To solve the traveling salesman problem (TSP) with clustering characteristics, a novel hybrid algorithm that combines the ant colony algorithm with the C-means algorithm is presented. To improve the speed of convergence, the TSP data are first clustered by the C-means algorithm, and the result is then processed by the ant colony algorithm to solve the problem. The proposed algorithm treats the C-means algorithm as a new search operator and adopts a local search strategy, 2-opt, to improve search performance. Given the number of clusters, the algorithm can obtain a good solution. Compared with three other algorithms (the ant colony algorithm, the genetic algorithm, and the simulated annealing algorithm), the proposed algorithm converges to the global optimum faster and with higher accuracy. The algorithm can also be extended to other related clustering-based combinatorial optimization problems. Experimental results indicate the validity of the proposed algorithm.
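The 2-opt local search named in the abstract can be sketched in isolation. This is a minimal sketch of plain 2-opt on a closed tour, assuming Euclidean city coordinates; the ant colony construction and the C-means clustering stage of the hybrid are not reproduced, and the function names and the four-city example are illustrative assumptions.

```python
import math


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the order given by tour."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def two_opt(tour, pts):
    """Reverse a segment whenever that shortens the tour, until no move helps."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # Replace edges (a, b) and (c, d) with (a, c) and (b, d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


# Unit square visited in a self-crossing order.
cities = [(0, 0), (1, 1), (1, 0), (0, 1)]
start = [0, 1, 2, 3]
better = two_opt(start, cities)
```

2-opt only guarantees a tour that no single segment reversal can shorten, so in a hybrid like the one described it serves as a refinement step applied to tours built by another method.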
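Abstract: Algorithms running under the MapReduce mechanism spend too high a proportion of their time on communication, which limits their practical value. To address this, a two-stage parallel c-Means clustering algorithm based on Hadoop is proposed for classifying very large data sets. First, the MPI communication management method under the MapReduce mechanism is improved, and a membership management protocol is adopted to synchronize membership management with the MapReduce reduce operation. Second, a reduce operation over typical individual groups replaces the reduce operation over all individuals, and a two-stage buffering algorithm is defined. Finally, buffering in the first stage further reduces the amount of data handled by the MapReduce operation in the second stage, minimizing the negative impact of big data on the algorithm. On this basis, simulations are carried out on a synthetic big-data test set and the KDD CUP 99 intrusion test set. The experimental results show that the algorithm meets the clustering accuracy requirement while effectively improving running efficiency.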
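The general idea of shrinking the data before the global reduce can be illustrated with a local-summary pattern. The sketch below is a generic two-stage aggregation for one hard c-means (k-means-style) iteration, written in plain Python rather than Hadoop; it is not the paper's algorithm. The membership management protocol, the MPI changes, and the "typical individual group" selection are not reproduced, and the function names and the example partitions are assumptions for illustration.

```python
import math
from collections import defaultdict


def local_stage(partition, centers):
    """Stage 1 (per node): assign each point to its nearest center and keep
    only one (vector sum, count) summary per cluster."""
    dim = len(centers[0])
    acc = defaultdict(lambda: [[0.0] * dim, 0])
    for x in partition:
        i = min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))
        acc[i][0] = [s + v for s, v in zip(acc[i][0], x)]
        acc[i][1] += 1
    return dict(acc)


def global_stage(partials, centers):
    """Stage 2: merge the per-node summaries and recompute every center.

    A cluster that received no points keeps its previous center.
    """
    dim = len(centers[0])
    merged = defaultdict(lambda: [[0.0] * dim, 0])
    for part in partials:
        for i, (vec_sum, count) in part.items():
            merged[i][0] = [a + b for a, b in zip(merged[i][0], vec_sum)]
            merged[i][1] += count
    return [
        [v / merged[i][1] for v in merged[i][0]] if i in merged else list(centers[i])
        for i in range(len(centers))
    ]


# Two hypothetical data partitions, as if stored on two nodes.
partitions = [[(0, 0), (0, 2)], [(10, 10), (10, 12), (0, 4)]]
old_centers = [(0, 0), (10, 10)]
partials = [local_stage(p, old_centers) for p in partitions]
new_centers = global_stage(partials, old_centers)
```

Each node sends at most one summary per cluster to the second stage instead of every point, so the volume of data crossing the network depends on the number of clusters and nodes rather than on the size of the data set.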