Funding: This research was partly supported by the National Science and Technology Council, Taiwan, under Grant Numbers 112-2221-E-992-045, 112-2221-E-992-057-MY3, and 112-2622-8-992-009-TD1.
Abstract: Since its inception, the Internet has evolved rapidly. With advances in science and technology and the explosive growth of the population, demand for the Internet has been rising. Applications in education, healthcare, entertainment, science, and other fields are increasingly deployed on the Internet. Concurrently, malicious threats on the Internet are rising as well. Distributed Denial of Service (DDoS) attacks are among the most common and dangerous of these threats, and their scale and complexity are constantly growing. Intrusion Detection Systems (IDS) have been deployed and have proven effective in defending against such threats. In addition, research on Machine Learning (ML) and Deep Learning (DL) in IDS has produced effective results and attracted significant attention. However, one challenge in applying ML and DL techniques to intrusion detection is identifying unknown attacks. These attacks, which are not encountered during the system’s training, can lead to misclassification with significant errors. In this research, we address Unknown Attack Detection by combining two methods: Spatial Location Constraint Prototype Loss (SLCPL) and Fuzzy C-Means (FCM). The proposed method achieves promising results compared to traditional methods. It reaches an accuracy of up to 99.8% with a low false positive rate for known attacks on the Intrusion Detection Evaluation Dataset (CICIDS2017). For unknown DDoS attacks on the DDoS Evaluation Dataset (CICDDoS2019), the accuracy reaches 99.7% and the precision goes up to 99.9%. The success of the proposed method is due to the combination of SLCPL, an advanced Open-Set Recognition (OSR) technique, and FCM, a traditional yet highly applicable clustering technique. This combination yields a novel method in the field of unknown attack detection and further extends the trend of applying DL and ML techniques to intrusion detection systems and cybersecurity. Finally, implementing the proposed method in real-world systems can strengthen security against increasingly complex threats on computer networks.
Funding: The National Natural Science Foundation of China (No. 62262011); the Natural Science Foundation of Guangxi (No. 2021JJA170130).
Abstract: Fuzzy C-Means (FCM) is an effective and widely used clustering algorithm, but it still has some problems: the number of clusters must be determined manually, the locally optimal solution is easily influenced by the random selection of initial cluster centers, and Euclidean distance performs poorly on complex high-dimensional data. To solve these problems, an improved FCM clustering algorithm based on density Canopy and Manifold learning (DM-FCM) is proposed. First, a density Canopy algorithm based on improved local density is proposed to automatically determine the number of clusters and the initial cluster centers, which improves the self-adaptability and stability of the algorithm. Then, considering that high-dimensional data often present a nonlinear structure, the manifold learning method is applied to construct a manifold spatial structure, which preserves the global geometric properties of complex high-dimensional data and improves the clustering effect of the algorithm on complex high-dimensional datasets. The Fowlkes-Mallows Index (FMI), the weighted average of homogeneity and completeness (V-measure), Adjusted Mutual Information (AMI), and the Adjusted Rand Index (ARI) are used as performance measures of the clustering algorithms. The experimental results show that the manifold learning method is the superior distance measure, and that the algorithm improves clustering accuracy and performs well on both low-dimensional and complex high-dimensional data.
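For orientation, the baseline that this abstract sets out to improve can be sketched as follows. This is a minimal sketch of standard FCM only (random membership initialization, Euclidean distance, a fixed cluster count `c`), not the DM-FCM method; the function name `fcm`, its parameters, and the toy data are illustrative assumptions. It makes visible the two inputs the abstract criticizes: the manually chosen `c` and the random start controlled by `seed`.

```python
import random

def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[i][k] is the membership of point i
    in cluster k and every row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update from squared Euclidean distances.
        shift = 0.0
        for i in range(n):
            d2 = [max(sum((a - b) ** 2 for a, b in zip(points[i], ck)), 1e-12)
                  for ck in centers]
            inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u

# Toy data (assumed for illustration): two well-separated groups.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fcm(pts, c=2)
labels = [row.index(max(row)) for row in u]
print(labels)
```

Changing `seed` or `c` changes the starting point and the number of clusters, which is the sensitivity that the density Canopy step in the abstract is meant to remove.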
Funding: Supported by the National Natural Science Foundation of China (61139002).
Abstract: Partition-based clustering with weighted features is developed in the framework of shadowed sets. The objects in the core and boundary regions, generated by shadowed sets-based clustering, have different impacts on the prototype of each cluster. By integrating feature weights, a formula for weight calculation is introduced into the clustering algorithm. The selection of the weight exponent is crucial for good results, and the weights are updated iteratively with each partition of clusters. The convergence of the weighted algorithms is given, and feasible cluster validity indices from data mining applications are utilized. Experimental results on both synthetic and real-life numerical data with different feature weights demonstrate that the weighted algorithm performs better than the unweighted algorithms.
Abstract: A novel model of fuzzy clustering, the allied fuzzy c-means (AFCM) model, is proposed by combining the advantages of fuzzy c-means (FCM) and possibilistic c-means (PCM) clustering. PCM is sensitive to initialization and often generates coincident clusters. AFCM, an extension of PCM, overcomes this shortcoming. Membership and typicality values can be produced simultaneously in AFCM. Experimental results show that noisy data can be processed well, coincident clusters are avoided, and clustering accuracy is better.
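The difference between a membership and a typicality value can be illustrated with a small sketch. It uses the standard FCM membership formula and the standard PCM typicality formula, not the AFCM objective, which the abstract does not state; the function names, the scale parameters `eta`, and the example point are assumptions made for illustration.

```python
def fcm_memberships(d2, m=2.0):
    """FCM memberships of one point from its squared distances to each centre.

    The values are relative: they always sum to 1 across clusters.
    """
    inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
    total = sum(inv)
    return [v / total for v in inv]

def pcm_typicalities(d2, eta, m=2.0):
    """PCM typicalities of one point; eta[k] is the scale of cluster k.

    The values are absolute: each depends only on the distance to its own centre.
    """
    return [1.0 / (1.0 + (x / e) ** (1.0 / (m - 1.0)))
            for x, e in zip(d2, eta)]

# Assumed example: centres at (0, 0) and (10, 0), and a noise point at (5, 100).
centres = [(0.0, 0.0), (10.0, 0.0)]
noise = (5.0, 100.0)
d2 = [sum((a - b) ** 2 for a, b in zip(noise, c)) for c in centres]

u = fcm_memberships(d2)
t = pcm_typicalities(d2, eta=[1.0, 1.0])
print(u, t)
```

Because FCM memberships must sum to 1, the distant noise point still receives a membership of 0.5 in each cluster, whereas its typicality in both clusters is close to 0. This is the property that lets typicality-based models down-weight noise.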
Funding: The National Natural Science Foundation of China (No. 60672056); Open Fund of MOE-MS Key Laboratory of Multimedia Computing and Communication (No. 06120809).
Abstract: To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of the final clusters, are derived from the key phrases using WordNet. Initial centers and the membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built with the topic concept phrases representing the topics of the texts, and the initialization of the centers and the membership matrix depends on the concept vectors in the sub-spaces. The results show that, unlike the random initialization of traditional fuzzy c-means clustering, initialization based on text content can improve clustering precision.
Funding: The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2006BAH02A06).
Abstract: To solve the traveling salesman problem with clustering characteristics, a novel hybrid algorithm that combines the ant colony algorithm with the C-means algorithm is presented. To improve the speed of convergence, the traveling salesman problem (TSP) data is first clustered by the C-means algorithm, and the result is then processed by the ant colony algorithm to solve the problem. The proposed algorithm treats the C-means algorithm as a new search operator and adopts a local search strategy, 2-opt, to improve search performance. Given the cluster number, the algorithm can obtain a preferable solution. Compared with three other algorithms (the ant colony algorithm, the genetic algorithm, and the simulated annealing algorithm), the proposed algorithm converges to the global optimum faster and has higher accuracy. The algorithm can also be extended to solve other related clustering combinatorial optimization problems. Experimental results indicate the validity of the proposed algorithm.
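The 2-opt local search named in this abstract can be sketched on its own. The following is a minimal sketch of plain 2-opt on a closed tour, without the ant colony or C-means components of the hybrid; the function names and the four-city example are assumptions made for illustration.

```python
import math

def tour_length(tour, pts):
    """Length of the closed tour that visits pts in the order given by tour."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def two_opt(tour, pts):
    """Repeatedly reverse a segment whenever doing so shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # Gain from replacing edges (a, b) and (c, d) with (a, c) and (b, d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

# Assumed example: the corners of a unit square visited in a self-crossing order.
pts = [(0, 0), (1, 1), (1, 0), (0, 1)]
start = [0, 1, 2, 3]
best = two_opt(start, pts)
print(tour_length(start, pts), tour_length(best, pts))
```

Each accepted move strictly shortens the tour, so the loop terminates. The result is a local optimum only, which is why the abstract pairs 2-opt with a global search method.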
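Abstract: To address the excessive share of communication time in algorithms running under the MapReduce mechanism, which limits their practical value, a Hadoop-based two-stage parallel c-means clustering algorithm is proposed for classifying very large data sets. First, the MPI communication management method under MapReduce is improved, and a membership management protocol is used to synchronize membership management with the MapReduce reduce operation. Second, a reduce operation over groups of typical individuals replaces the reduce operation over all individuals, and a two-stage buffering algorithm is defined. Finally, buffering in the first stage further reduces the amount of data handled by the second-stage MapReduce operation, minimizing the negative effect of big data on the algorithm. On this basis, simulations were carried out on a synthetic big-data test set and the KDD CUP 99 intrusion test set. The experimental results show that the algorithm meets the clustering accuracy requirement while effectively speeding up execution.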
Funding: Supported by the National Natural Science Foundation of China (60874031, 60740430664).
Abstract: The fuzzy c-means (FCM) algorithm is one of the most popular methods for image segmentation. However, the standard FCM algorithm is sensitive to noise because it does not take into account the spatial information in the image. An improved FCM algorithm is proposed to improve the anti-noise performance of the FCM algorithm. The new algorithm is formulated by incorporating spatial neighborhood information into the membership function for clustering. The distribution statistics of the neighborhood pixels and the prior probability are used to form a new membership function. The algorithm not only removes noise spots effectively but also reduces the number of misclassified pixels. Experimental results indicate that the proposed algorithm is more accurate and more robust to noise than the standard FCM algorithm.
Funding: Supported in part by the Doctoral Students’ Short Term Study Abroad Scholarship Fund of Xidian University; the National Natural Science Foundation of China (61873342, 61672400, 62076189); the Recruitment Program of Global Experts; the Science and Technology Development Fund, MSAR (0012/2019/A1).
Abstract: In this paper, we elaborate on residual-driven Fuzzy C-Means (FCM) for image segmentation, which is the first approach that realizes accurate residual (noise/outliers) estimation and enables a noise-free image to participate in clustering. We propose a residual-driven FCM framework by integrating into FCM a residual-related regularization term derived from the distribution characteristics of different types of noise. Built on this framework, a weighted ℓ2-norm regularization term is presented by weighting the mixed noise distribution, resulting in a universal residual-driven FCM algorithm for mixed or unknown noise. In addition, with the constraint of spatial information, the residual estimation becomes more reliable than one that considers only the observed image itself. Supporting experiments on synthetic, medical, and real-world images are conducted. The results demonstrate the superior effectiveness and efficiency of the proposed algorithm over its peers.
Funding: Supported by the National Natural Science Youth Foundation of China (No. 41501283); the Fundamental Research Funds for the Central Universities (2015ZCQGX-04).
Abstract: The complex geometry and topology of soil is widely recognised as the key driver in many ecological processes. X-ray computed tomography (CT) provides insight into the internal structure of soil pores automatically and accurately. Until recently, there have not been methods to identify soil pore structures. This has restricted the development of soil science, particularly regarding pore geometry and spatial distribution. Through the adoption of fuzzy clustering theory and the establishment of pore identification rules, a novel pore identification method is described to extract pore structures from CT soil images. The robustness of the adaptive fuzzy C-means (AFCM) method, the adaptive threshold method, and Image-Pro Plus tools was compared on soil specimens under different conditions, such as frozen, saturated, and dry states. The results demonstrate that the AFCM method is suitable for identifying pore clusters, especially tiny pores, under various soil conditions. The method provides an alternative technique for the study of soil micromorphology.