To address two drawbacks of the k-means algorithm, namely that the number of clusters must be known in advance and that the result is sensitive to the choice of initial cluster centers, an improved k-means clustering algorithm is proposed. First, the silhouette coefficient is introduced, and the optimal number of clusters Kopt for a data set with unknown class information is determined by calculating the silhouette coefficient of the objects in each cluster under different values of K. Then the distribution of the data set is obtained through hierarchical clustering, and the initial cluster centers are determined. Finally, the clustering is completed with traditional k-means. Theoretical analysis shows that the improved algorithm has acceptable computational complexity. Experimental results on the IRIS test data set show that the algorithm distinguishes the clusters reasonably, recognizes outliers efficiently, and produces lower entropy.
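The selection of Kopt by silhouette coefficient can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the helper names (`kmeans`, `mean_silhouette`), and the farthest-point seeding are assumptions made here for illustration, whereas the abstract derives initial centers from hierarchical clustering.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations. Seeding is deterministic farthest-point
    selection (an assumption; the abstract uses hierarchical clustering)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Average of s(i) = (b - a) / max(a, b); singleton clusters score 0.
    Assumes at least two clusters and no cluster of identical points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]

scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On this toy set the highest mean silhouette is reached at K = 3, matching the three groups that were built into the data.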
Several pests feed on leaves, stems, bases, and the entire plant, causing plant diseases. It is therefore vital to identify and eliminate a disease before it damages the plants. Detecting and treating plant disease manually is challenging because it requires much effort and a long processing time, so image processing is employed instead. The main goal of this study is to identify the disease affecting a plant by building an image processing system that can recognize and classify four plant diseases: Phytophthora infestans, Fusarium graminearum, Puccinia graminis, and tomato yellow leaf curl. The work uses a support vector machine (SVM) classifier to detect and classify plant disease through the steps of image acquisition, pre-processing, segmentation, feature extraction, and classification. Gray level co-occurrence matrix (GLCM) and local binary pattern (LBP) features are used to identify the disease-affected portion of the plant leaf. According to the experimental data, the proposed technique detects and diagnoses plant disease with 97.2 percent accuracy.
The K-means algorithm is widely known for its simplicity and speed in text clustering. However, the traditional K-means algorithm selects the initial cluster centers somewhat randomly, so the clustering results fluctuate and are unstable, depending strongly on the initial cluster centers. This paper proposes an algorithm for selecting the initial cluster centers that eliminates the uncertainty of center selection. The experimental results show that the improved K-means clustering algorithm is superior to the traditional algorithm.
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
Offboard active decoys (OADs) can effectively jam monopulse radars. However, for missiles approaching from a particular direction and distance, the OAD must be placed at a specific location, which places high demands on timing and deployment. To improve the response speed and jamming effect, a cluster of OADs based on an unmanned surface vehicle (USV) is proposed. The formation of the cluster determines the effectiveness of jamming. First, based on the mechanism of OAD jamming, critical conditions are identified, and a method for assessing the jamming effect is proposed. Then, to optimize the cluster formation, a mathematical model is built, and a multi-tribe adaptive particle swarm optimization algorithm based on a mutation strategy and the Metropolis criterion (3M-APSO) is designed. Finally, the formation optimization problem is solved and analyzed with the 3M-APSO algorithm under specific scenarios. The results show that the improved algorithm converges faster and performs better than the standard Adaptive-PSO algorithm. Compared with a single OAD, the optimal formation of the USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.
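The objective stated in this abstract can be written compactly. The symbols x_i (data points) and c_j (centers) are names chosen here for illustration and do not come from the paper:

```latex
\min_{c_1,\dots,c_k \in \mathbb{R}^d} \;
\frac{1}{n} \sum_{i=1}^{n} \; \min_{1 \le j \le k} \lVert x_i - c_j \rVert^{2}
```

The inner minimum picks the nearest center for each point, and the outer minimization moves the k centers to reduce the mean of those squared distances.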
Because of limitations and hesitation in one's knowledge, the membership degree of an element in a given set often takes several different values, a case that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. This paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of the algorithm.
Machine learning algorithms are an important means of performing landslide susceptibility assessments, but most studies use GIS-based classification methods to conduct susceptibility zonation. This study presents a machine learning approach based on the C5.0 decision tree (DT) model and the K-means clustering algorithm to produce a regional landslide susceptibility map. Yanchang County, a typical landslide-prone area in northwestern China, was taken as the area of interest to introduce the proposed application procedure. A landslide inventory containing 82 landslides was prepared and then randomly partitioned into two subsets: training data (70% of landslide pixels) and validation data (30% of landslide pixels). Fourteen landslide influencing factors were included in the input dataset and used to calculate the landslide occurrence probability with the C5.0 decision tree model. Susceptibility zonation was implemented according to the cut-off values calculated by the K-means clustering algorithm. The validation results showed that the AUC (area under the receiver operating characteristic (ROC) curve) of the proposed model was the highest, reaching 0.88, compared with traditional models (support vector machine (SVM) = 0.85, Bayesian network (BN) = 0.81, frequency ratio (FR) = 0.75, weight of evidence (WOE) = 0.76). The landslide frequency ratio and frequency density of the high susceptibility zones were 6.76/km^2 and 0.88/km^2, respectively, much higher than those of the low susceptibility zones. The top 20% interval of landslide occurrence probability contained 89% of the historical landslides but accounted for only 10.3% of the total area. Our results indicate that the distribution of high susceptibility zones was more focused and did not include additional "stable" pixels. Therefore, the obtained susceptibility map is suitable for landslide risk management.
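How K-means can turn continuous occurrence probabilities into zonation cut-offs can be sketched as follows. The abstract does not say how the cut-off values are read off the clusters; this sketch assumes they are the midpoints between adjacent one-dimensional cluster centers, and the probability values, the quantile seeding, and the function names are hypothetical.

```python
from bisect import bisect_right

def kmeans_1d(values, k, iters=100):
    """One-dimensional Lloyd iterations; returns the sorted cluster centers."""
    xs = sorted(values)
    # Seed the centers at evenly spaced quantiles (an illustrative choice).
    centers = [xs[(2 * j + 1) * len(xs) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:  # converged
            break
        centers = updated
    return sorted(centers)

def cutoffs(centers):
    """Midpoints between adjacent centers: the 1-D k-means decision boundaries."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def zone(probability, cuts):
    """Index of the susceptibility zone (0 = lowest) a probability falls into."""
    return bisect_right(cuts, probability)

# Hypothetical occurrence probabilities, for example as output by a classifier.
probs = [0.02, 0.05, 0.08, 0.45, 0.50, 0.55, 0.90, 0.93, 0.97]
cuts = cutoffs(kmeans_1d(probs, 3))
print(cuts, zone(0.60, cuts))
```

With three clusters the sketch yields two cut-offs, so a probability of 0.60 lands in the middle zone.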
Based on the Joint Typhoon Warning Center (JTWC) best-track dataset for 1965 to 2009 and characteristic parameters including tropical cyclone (TC) position, intensity, path length, and direction, a method for the objective classification of Northwestern Pacific tropical cyclone tracks is established using k-means clustering. The TC lifespan, energy, active season, and landfall probability of seven clusters of tropical cyclone tracks are compared. These parameters differ considerably among the track clusters. Over the past two decades, the frequency of the western recurving cluster (accounting for 21.3% of the total) increased and its lifespan lengthened slightly, which differs from the other clusters. The annual variation of the Power Dissipation Index (PDI) of most clusters depended mainly on TC intensity and frequency. However, the annual variation of the PDI in the northwestern-moving-then-recurving cluster and the pelagic west-northwest-moving cluster depended mainly on frequency.
Classifying the activity paths of the Northeast China Cold Vortex (NCCV) is an important way to analyze its characteristics in detail. Based on daily precipitation data for the northeastern China (NEC) region and six-hourly ERA-Interim atmospheric circulation and temperature field data, the NCCV processes during early summer (June) from 1979 to 2018 were objectively identified. The NCCV processes were then classified using a machine learning method (k-means) according to characteristic parameters of the activity path information. The rationality of the classification was verified from two aspects: (1) the atmospheric circulation configuration of the NCCV on the various paths; and (2) its influence on climate conditions in the NEC. The results showed that the NCCV activity paths can be divided into four types according to characteristics such as the generation origin, movement direction, and movement velocity of the NCCV: the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type, or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type, or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type, or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type, or type D). The atmospheric circulation configuration and the climate impact of the NCCV differed clearly among the four path types, which indicates that the classification results are reasonable.
Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on conventional K-means clustering is very fast and easy to implement, but its accuracy is generally unsatisfactory. It is proved theoretically that vectors x(t) with different moduli contribute unequally, and a weighted K-means clustering method is proposed on these grounds. Numerical experiments demonstrate that the proposed algorithm is as fast as conventional K-means clustering and also achieves accurate results.
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was compiled from unconfined compression tests, Proctor tests, and grain distribution tests of soils taken from three types of foundation pits: raft foundations, partial raft foundations, and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
The Tarq 1:100,000 geochemical sheet is located in Isfahan province and has been investigated by Iran’s Geological and Explorations Organization using stream sediment analyses. The area contains rocks ranging in age from Precambrian to Quaternary and lies in the Central Iran zone. Because there are signs of gold mineralization in the area, it is necessary to identify its important mineral zones. This requires information about how the elements gold, arsenic, and antimony relate to one another, in order to determine the extent of geochemical halos and to estimate grade. The well-known K-means method is therefore used in this study; it is a clustering method based on minimizing the total Euclidean distance of each sample from the center of the class to which it is assigned. The clustering quality function and the utility rate of a sample in its cluster (S(i)) are used to determine the optimum number of clusters. Finally, based on the cluster centers and the results, equations are used to predict the gold content from four parameters: arsenic grade, antimony grade, and the length and width of the sampling points.
A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. To identify convective and stratiform clouds in different developmental phases, two-dimensional (2D) and three-dimensional (3D) models are proposed that use reflectivity factors at the 0.5° elevation angle and at the 0.5°, 1.5°, and 2.4° elevation angles, respectively. According to the thresholds of the algorithm, which include echo intensity, the echo top height of 35 dBZ (ET), the density threshold, and the ε-neighborhood, cloud clusters are divided into four types: deep-convective cloud (DCC), shallow-convective cloud (SCC), hybrid convective-stratiform cloud (HCS), and stratiform cloud (SFC). Each cloud cluster type is further divided into a core area and a boundary area, which provides richer information on cloud structure. The algorithm is verified using volume scan data observed with new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that the improved DBSCAN algorithm intuitively identifies cloud clusters as core and boundary points, whose areas change continuously during convective evolution. Therefore, the occurrence and disappearance of convective weather can be estimated in advance by observing changes in the classification. Because the density thresholds differ and multiple elevations are used in the 3D model, the identified echo types and areas differ between the 2D and 3D models. The 3D model identifies larger convective and stratiform clouds than the 2D model. However, developing convective clouds of small area at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds. In addition, the 3D model can avoid the influence of the melting layer and better indicate convective clouds in the developmental stage.
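The distinction between core and boundary points comes from the standard DBSCAN definitions and can be sketched in a few lines. This is generic DBSCAN point classification only, not the improved radar algorithm of the abstract; the coordinates, `eps`, and `min_pts` values are hypothetical, and the echo-intensity and echo-top thresholds are not modeled.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'boundary', or 'noise' as in standard DBSCAN.

    A point is core if its eps-neighborhood (including itself) holds at least
    min_pts points; a non-core point is boundary if a core point lies within
    eps of it; everything else is noise.
    """
    neighbors = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
                 for p in points]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}
    kinds = []
    for i, nb in enumerate(neighbors):
        if i in core:
            kinds.append("core")
        elif any(j in core for j in nb):
            kinds.append("boundary")
        else:
            kinds.append("noise")
    return kinds

# Hypothetical grid cells standing in for radar echo positions.
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (5, 5)]
kinds = classify_points(cells, eps=1.0, min_pts=3)
print(kinds)
```

In this toy layout the four cells of the unit square are core, the cell at (2, 1) hangs off one core cell and is boundary, and the isolated cell at (5, 5) is noise.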
In this paper, an improved k-means based clustering method (IKCM) is proposed. By refining the initial cluster centers and adjusting the number of clusters through splitting and merging procedures, it keeps the algorithm from becoming trapped in a locally optimal solution and reduces its dependence on the number of clusters. The IKCM has been implemented and tested on the KDD-99 data set, and comparison experiments with H-means+ have also been conducted. The results obtained in this study are very encouraging.
Due to the widespread use of the Internet, customer information is vulnerable to attacks on computer systems, which creates an urgent need for intrusion detection technology. Network intrusion detection has recently become one of the most important technologies in network security. Existing methods have achieved high accuracy, but they are very inefficient, even the most popular SOM neural network method. In this paper, an efficient and fast network intrusion detection method is proposed. First, the fundamentals of the two different methods are introduced. Then, a self-organizing feature map neural network based on K-means clustering (KSOM) is presented to improve the efficiency of network intrusion detection. Finally, the NSL-KDD data set is used to demonstrate that the KSOM method needs significantly fewer clustering iterations than the SOM method without substantially affecting the clustering results, and that its accuracy is much higher than that of the K-means method. The experimental results show that the method improves the accuracy of network intrusion detection and significantly reduces the number of clustering iterations.
To improve the quality and efficiency of color image segmentation, a novel approach is proposed that combines the advantages of mean shift (MS) segmentation and an improved ant clustering method. Regions that preserve the discontinuity characteristics of an image are segmented by the MS algorithm and then represented by a graph in which every region is a node. To solve the graph partition problem, an improved ant clustering algorithm called the similarity carrying ant model (SCAM-ant) is proposed, with a new similarity calculation method. With SCAM-ant, the maximum number of items that each ant can carry increases, the clustering time is effectively reduced, and globally optimized clustering can be achieved. Because the graph is built not on the pixels of the original image but on the segmentation result of the MS algorithm, the computational complexity is greatly reduced. Experiments show that the proposed method segments color images efficiently and, compared with conventional pixel-based methods, improves segmentation quality and robustness to interference.
In recent years, with the continuous development of information technology and the rapid growth of network scale, network monitoring and management have become increasingly important. Network traffic is an important part of the network state. To ensure normal operation of the network, improve its availability, find faults in time, and deal with attacks, it is necessary to detect abnormal traffic. Abnormal traffic detection is therefore of great significance in practical network management. To improve the accuracy and efficiency of network traffic anomaly detection, this paper proposes a comprehensive anomaly detection method based on improved GRU traffic prediction and improved K-means clustering, and cascades the traffic prediction and the clustering to detect anomalies. First, an improved highway-GRU algorithm, HS-GRU (an improved Gated Recurrent Unit neural network based on the Highway network and the STL algorithm), is proposed; it combines the STL decomposition algorithm with a highway GRU neural network and is used to predict traffic. Then, the EFMS-Kmeans algorithm (an improved clustering algorithm that combines a Mean Shift algorithm based on electrostatic force with K-means clustering) is proposed to overcome the inability of traditional K-means clustering to determine the number of clusters automatically. The sum of squared errors (SSE) method and the silhouette (contour) coefficient method are both used to check the clustering effect. After the cluster centers are determined, the potential energy gradient is used directly for anomaly detection with a threshold method, which takes the local characteristics of the data into account and ensures the accuracy of anomaly detection. The simulation results show that the proposed anomaly detection algorithm based on HS-GRU and EFMS-Kmeans clustering effectively improves the accuracy of traffic anomaly detection and has important application value.
Because the kernel K-means clustering method runs in an implicit feature space, the initial and iterated cluster centers cannot be defined explicitly. To overcome the shortcoming of existing methods, which select the initial cluster centers arbitrarily in the original space, this paper proposes a new method for determining the cluster centers: virtual cluster centers are defined in the feature space from the original classification and used as the initial cluster centers, and the cluster centers in each iteration are determined by a further virtual classification. The improved method is applied to fault diagnosis of roller bearings and achieves good clustering and diagnosis results, which demonstrates its effectiveness.
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals, which can be used to identify them. A semi-supervised machine learning algorithm, the k-means clustering algorithm, is used to investigate and identify plasma events in the J-TEXT plasma. The method clusters diverse plasma events with homogeneous features, and the events can then be identified from a few manually labeled examples based on physical understanding. A survey of the clustered events reveals that the k-means algorithm gathers plasma events (rotating tearing mode, sawtooth oscillations, and locked mode) together in the Euclidean space spanned by multi-dimensional diagnostic data such as soft x-ray emission intensity, edge toroidal rotation velocity, and Mirnov signal amplitude. Based on the cluster analysis results, an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma. The cluster analysis method is conducive to labeling massive amounts of diagnostic data.
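The step of naming clusters from a handful of labeled examples can be sketched as below. The abstract does not state the rule used; a simple majority vote within each cluster is assumed here, and the cluster assignments, sample indices, and function names are hypothetical.

```python
from collections import Counter

def name_clusters(cluster_ids, labeled):
    """Map each cluster id to the most common label among its labeled samples.

    cluster_ids: cluster id per sample, e.g. as produced by k-means.
    labeled: {sample index: event name} for the few hand-labeled samples.
    """
    votes = {}
    for index, label in labeled.items():
        votes.setdefault(cluster_ids[index], Counter())[label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}

def propagate(cluster_ids, labeled):
    """Give every sample the name of its cluster, or 'unlabeled' if none exists."""
    names = name_clusters(cluster_ids, labeled)
    return [names.get(cluster, "unlabeled") for cluster in cluster_ids]

# Hypothetical output of a clustering run over eight time slices.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
labeled = {0: "rotating tearing mode", 4: "sawtooth oscillations"}
events = propagate(cluster_ids, labeled)
print(events)
```

Two hand-labeled samples are enough to name two of the three clusters here; the third stays unlabeled until someone inspects one of its members.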
Funding (improved k-means algorithm with silhouette-based selection of Kopt): The National Natural Science Foundation of China (No. 50674086), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508), and the Youth Scientific Research Foundation of China University of Mining and Technology (No. 2006A047).
Funding (plant disease detection study): Supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number PNURSP2023R104, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding: National Natural Science Foundation of China (Grant No. 62101579).
Abstract: Offboard active decoys (OADs) can effectively jam monopulse radars. However, for missiles approaching from a particular direction and distance, the OAD should be placed at a specific location, posing high requirements for timing and deployment. To improve the response speed and jamming effect, a cluster of OADs based on an unmanned surface vehicle (USV) is proposed. The formation of the cluster determines the effectiveness of jamming. First, based on the mechanism of OAD jamming, critical conditions are identified, and a method for assessing the jamming effect is proposed. Then, for the optimization of the cluster formation, a mathematical model is built, and a multi-tribe adaptive particle swarm optimization algorithm based on mutation strategy and Metropolis criterion (3M-APSO) is designed. Finally, the formation optimization problem is solved and analyzed using the 3M-APSO algorithm under specific scenarios. The results show that the improved algorithm has a faster convergence rate and superior performance compared to the standard Adaptive-PSO algorithm. Compared with a single OAD, the optimal formation of the USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.
Abstract: In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next iteration. Our experimental results demonstrate that our scheme can improve the computational speed of the k-means algorithm, reducing both the total number of distance calculations and the overall computation time.
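The abstract above does not say what information is carried between iterations. As a hedged illustration only, the sketch below assumes the cached item is each point's distance to its assigned center: if the point is no farther from its cluster's updated center than before, it keeps its label and the other k-1 distances are skipped. This shortcut is a heuristic and may differ from the paper's actual scheme; the names `cached_kmeans` and `n_dist` are hypothetical.

```python
import math

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def cached_kmeans(points, centers, max_iter=100):
    """k-means with a per-point cache of the last distance to its center.

    Assumption (not from the source): a point whose distance to its own
    updated center did not grow keeps its label without a full search.
    Returns (centers, labels, number of distance calculations).
    """
    centers = [tuple(c) for c in centers]
    labels = [None] * len(points)
    cached = [math.inf] * len(points)
    n_dist = 0
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if labels[i] is not None:
                d = math.dist(p, centers[labels[i]])
                n_dist += 1
                if d <= cached[i]:
                    cached[i] = d  # still at least as close: keep the label
                    continue
            # Full search over all centers.
            ds = [math.dist(p, c) for c in centers]
            n_dist += len(centers)
            best = min(range(len(centers)), key=ds.__getitem__)
            if best != labels[i]:
                changed = True
            labels[i] = best
            cached[i] = ds[best]
        if not changed:
            break  # labels are stable, so the centers are too
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = mean_point(members)
    return centers, labels, n_dist
```

Calling `cached_kmeans(points, initial_centers)` returns the final centers, the labels, and a count of distance evaluations that can be compared against a plain k-means run on the same data.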
Funding: Supported by the National Natural Science Foundation of China (61273209).
Abstract: Because one's knowledge is limited and hesitant, the membership degree of an element to a given set often takes several different values, a case that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of the algorithm.
Funding: This research is funded by the National Natural Science Foundation of China (Grant Nos. 41807285 and 51679117), the Key Project of the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (SKLGP2019Z002), the National Science Foundation of Jiangxi Province, China (20192BAB216034), the China Postdoctoral Science Foundation (2019M652287 and 2020T130274), the Jiangxi Provincial Postdoctoral Science Foundation (2019KY08), and the Fundamental Research Funds for National Universities, China University of Geosciences (Wuhan).
Abstract: Machine learning algorithms are an important means of performing landslide susceptibility assessments, but most studies use GIS-based classification methods to conduct susceptibility zonation. This study presents a machine learning approach based on the C5.0 decision tree (DT) model and the K-means cluster algorithm to produce a regional landslide susceptibility map. Yanchang County, a typical landslide-prone area located in northwestern China, was taken as the area of interest to introduce the proposed application procedure. A landslide inventory containing 82 landslides was prepared and subsequently randomly partitioned into two subsets: training data (70% of landslide pixels) and validation data (30% of landslide pixels). Fourteen landslide influencing factors were considered in the input dataset and were used to calculate the landslide occurrence probability based on the C5.0 decision tree model. Susceptibility zonation was implemented according to the cut-off values calculated by the K-means cluster algorithm. The validation results of the model performance analysis showed that the AUC (area under the receiver operating characteristic (ROC) curve) of the proposed model was the highest, reaching 0.88, compared with traditional models (support vector machine (SVM) = 0.85, Bayesian network (BN) = 0.81, frequency ratio (FR) = 0.75, weight of evidence (WOE) = 0.76). The landslide frequency ratio and frequency density of the high susceptibility zones were 6.76/km^2 and 0.88/km^2, respectively, which were much higher than those of the low susceptibility zones. The top 20% interval of landslide occurrence probability contained 89% of the historical landslides but accounted for only 10.3% of the total area. Our results indicate that the distribution of high susceptibility zones was more focused, without containing more "stable" pixels. Therefore, the obtained susceptibility map is suitable for application to landslide risk management practices.
Funding: National Basic Research Program of China (973 Program) (2015CB453200, 2012CB955903); National Natural Science Foundation of China (41575083, 41575108); Jiangsu Education Science Foundation (13KJA170002).
Abstract: Based on the Joint Typhoon Warning Center (JTWC) best-track dataset between 1965 and 2009 and the characteristic parameters including tropical cyclone (TC) position, intensity, path length and direction, a method for objective classification of the Northwestern Pacific tropical cyclone tracks is established by using k-means clustering. The TC lifespan, energy, active season and landfall probability of seven clusters of tropical cyclone tracks are comparatively analyzed. The characteristics of these parameters are quite different among different tropical cyclone track clusters. From the trend of the past two decades, the frequency of the western recurving cluster (accounting for 21.3% of the total) increased, and the lifespan lengthened slightly, which differs from the other clusters. The annual variation of the Power Dissipation Index (PDI) of most clusters mainly depended on the TC intensity and frequency. However, the annual variation of the PDI in the northwestern moving then recurving cluster and the pelagic west-northwest moving cluster mainly depended on the frequency.
Funding: This research was jointly supported by the National Natural Science Foundation of China (Grant No. 42005037), the Liaoning Provincial Natural Science Foundation Project (PhD Start-up Research Fund 2019-BS-214), the Special Scientific Research Project for the Forecaster (Grant No. CMAYBY2018-018), a Key Technical Project of Liaoning Meteorological Bureau (Grant No. LNGJ201903), the National Key Research and Development Project (Grant No. 2018YFC1505601), and the Open Foundation Project of the Institute of Atmospheric Environment, China Meteorological Administration (Grant Nos. 2020SYIAE08 and 2020SYIAEZD5).
Abstract: The classification of the Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze its characteristics in detail. Based on the daily precipitation data of the northeastern China (NEC) region, and the atmospheric circulation field and temperature field data of ERA-Interim for every six hours, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. Then, the NCCV processes were classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects, as follows: (1) the atmospheric circulation configuration of the NCCV on various paths; and (2) its influences on the climate conditions in the NEC. The obtained results showed that the activity paths of the NCCV could be divided into four types according to such characteristics as the generation origin, movement direction, and movement velocity of the NCCV. These included the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type or type D). There were obvious differences observed in the atmospheric circulation configuration and the climate impact of the NCCV on the four above-mentioned types of paths, which indicated that the classification results were reasonable.
Funding: National Natural Science Foundation of China (60672061).
Abstract: Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on conventional K-means clustering is very fast and easy to implement. However, the accuracy of this method is generally not satisfactory. Vectors x(t) with different moduli are theoretically proved to contribute unequally, and a weighted K-means clustering method is proposed on these grounds. The proposed algorithm is not only as fast as the conventional K-means clustering method, but also achieves considerably more accurate results, as demonstrated by numerical experiments.
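The abstract above does not state which weights the method uses. As a minimal sketch, assuming each observation x(t) is weighted by its Euclidean modulus, the center update of a weighted K-means becomes a weighted mean. The weighting rule and the function names are illustrative assumptions, not the paper's definition.

```python
import math

def modulus_weights(vectors):
    """Assumed weighting: each vector is weighted by its Euclidean norm."""
    return [math.sqrt(sum(c * c for c in v)) for v in vectors]

def weighted_centroid(vectors, weights):
    """Weighted mean of the vectors; replaces the plain mean in the update step."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    dim = len(vectors[0])
    return tuple(
        sum(w * v[d] for v, w in zip(vectors, weights)) / total
        for d in range(dim)
    )
```

With these weights, large-modulus observations pull the center more strongly than small ones, which is one way to encode the unequal contribution the abstract describes.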
Abstract: The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Abstract: The Tarq 1:100,000 geochemical sheet is located in Isfahan province and has been investigated by Iran’s Geological and Explorations Organization using stream sediment analyses. The area has a stratigraphy of Precambrian to Quaternary rocks and lies in the Central Iran zone. Given the signs of gold mineralization in this area, it is necessary to identify its important mineral areas. This requires information on how the elements gold, arsenic, and antimony relate to one another in the area, in order to determine the extent of geochemical halos and to estimate the grade. The well-known K-means method is therefore used for monitoring the elements in the present study; it is a clustering method based on minimizing the total Euclidean distance of each sample from the center of the class to which it is assigned. In this research, the clustering quality function and the utility rate of the sample in the desired cluster (S(i)) are used to determine the optimum number of clusters. Finally, with regard to the cluster centers and the results, equations were used to predict the amount of gold based on four parameters: arsenic grade, antimony grade, and the length and width of the sampling points.
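The quantity S(i) in the abstract above reads like the standard silhouette value, although the abstract does not define it. Under that assumption, the sketch below computes s(i) = (b - a) / max(a, b), where a is the mean distance from sample i to the other members of its own cluster and b is the smallest mean distance to any other cluster. The function names are hypothetical, and at least two clusters are required.

```python
import math

def silhouette(points, labels, i):
    """Silhouette value of sample i (assumes at least two clusters)."""
    own = labels[i]
    same = [math.dist(points[i], p)
            for j, p in enumerate(points) if labels[j] == own and j != i]
    if not same:
        return 0.0  # common convention for singleton clusters
    a = sum(same) / len(same)
    means_to_others = []
    for other in set(labels) - {own}:
        ds = [math.dist(points[i], p)
              for p, l in zip(points, labels) if l == other]
        means_to_others.append(sum(ds) / len(ds))
    b = min(means_to_others)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0

def mean_silhouette(points, labels):
    """Average silhouette over all samples; higher suggests a better clustering."""
    return sum(silhouette(points, labels, i)
               for i in range(len(points))) / len(points)
```

A typical use is to run K-means for several candidate values of K and keep the K whose labeling gives the largest `mean_silhouette`.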
Funding: Funded by the Key-Area Research and Development Program of Guangdong Province (Grant No. 2020B1111200001), the Key project of monitoring, early warning and prevention of major natural disasters of China (Grant No. 2019YFC1510304), the S&T Program of Hebei (Grant No. 19275408D), and the Scientific Research Projects of Weather Modification in Northwest China (Grant No. RYSY201905).
Abstract: A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. To identify convective and stratiform clouds in different developmental phases, two-dimensional (2D) and three-dimensional (3D) models are proposed by applying reflectivity factors at 0.5° and at 0.5°, 1.5°, and 2.4° elevation angles, respectively. According to the thresholds of the algorithm, which include echo intensity, the echo top height of 35 dBZ (ET), density threshold, and ε-neighborhood, cloud clusters can be marked into four types: deep-convective cloud (DCC), shallow-convective cloud (SCC), hybrid convective-stratiform cloud (HCS), and stratiform cloud (SFC) types. Each cloud cluster type is further identified as a core area and boundary area, which can provide more abundant cloud structure information. The algorithm is verified using the volume scan data observed with new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that cloud clusters can be intuitively identified as core and boundary points, which change in area continuously during the process of convective evolution, by the improved DBSCAN algorithm. Therefore, the occurrence and disappearance of convective weather can be estimated in advance by observing the changes of the classification. Because density thresholds are different and multiple elevations are utilized in the 3D model, the identified echo types and areas are dissimilar between the 2D and 3D models. The 3D model identifies larger convective and stratiform clouds than the 2D model. However, the developing convective clouds of small areas at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds. In addition, the 3D model can avoid the influence of the melting layer and better suggest convective clouds in the developmental stage.
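The core and boundary distinction in the abstract above follows the usual DBSCAN definitions, which the sketch below illustrates on plain coordinates. It shows only the textbook rule (a core point has at least `min_pts` neighbors within `eps`, itself included; a boundary point is a non-core point within `eps` of a core point) and none of the paper's radar-specific thresholds such as echo intensity or echo top height. The function name is hypothetical.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'boundary', or 'noise' by DBSCAN's definitions."""
    # Indices of all points within eps of each point (the point itself included).
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = []
    for i, nb in enumerate(neighbors):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in nb):
            labels.append("boundary")  # reachable from a core point
        else:
            labels.append("noise")
    return labels
```

This brute-force neighbor search is quadratic in the number of points, which is fine for a small illustration but would need a spatial index for full radar volume scans.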
Funding: Supported by the Beijing Municipal Commission of Education Science and Technology Project (KM200511232004).
Abstract: In this paper, an improved k-means based clustering method (IKCM) is proposed. By refining the initial cluster centers and adjusting the number of clusters through splitting and merging procedures, it keeps the algorithm from getting trapped in a locally optimal solution and reduces its dependence on the number of clusters. The IKCM has been implemented and tested. We perform experiments on the KDD-99 data set. Comparison experiments with H-means+ have also been conducted. The results obtained in this study are very encouraging.
Abstract: Due to the widespread use of the Internet, customer information is vulnerable to attacks on computer systems, which creates an urgent need for intrusion detection technology. Recently, network intrusion detection has become one of the most important technologies in network security. Existing methods already achieve high detection accuracy. However, these methods have very low efficiency, even the most popular SOM neural network method. In this paper, an efficient and fast network intrusion detection method is proposed. Firstly, the fundamentals of the two methods are introduced. Then, a self-organizing feature map neural network based on K-means clustering (KSOM) is presented to improve the efficiency of network intrusion detection. Finally, the NSL-KDD data set is used as the network intrusion data set to demonstrate that the KSOM method requires significantly fewer clustering iterations than the SOM method without substantially affecting the clustering results, and that its accuracy is much higher than that of the K-means method. The experimental results show that our method can improve the accuracy of network intrusion detection and significantly reduce the number of clustering iterations.
Funding: Project (60874070) supported by the National Natural Science Foundation of China.
Abstract: To improve the segmentation quality and efficiency of color images, a novel approach that combines the advantages of mean shift (MS) segmentation and an improved ant clustering method is proposed. The regions that preserve the discontinuity characteristics of an image are segmented by the MS algorithm, and they are then represented by a graph in which every region is represented by a node. To solve the graph partition problem, an improved ant clustering algorithm, called the similarity carrying ant model (SCAM-ant), is proposed, in which a new similarity calculation method is given. Using SCAM-ant, the maximum number of items that each ant can carry increases, the clustering time is effectively reduced, and globally optimized clustering can also be realized. Because the graph is based not on the pixels of the original image but on the segmentation result of the MS algorithm, the computational complexity is greatly reduced. Experiments show that the proposed method can realize color image segmentation efficiently and that, compared with conventional methods based on image pixels, it improves the image segmentation quality and the anti-interference ability.
Funding: Supported by the National Key R&D Program of China (2019YFB2103202, 2019YFB2103200) and the Open Subject Funds of Science and Technology on Communication Networks Laboratory (6142104200106).
Abstract: In recent years, with the continuous development of information technology and the rapid growth of network scale, network monitoring and management have become more and more important. Network traffic is an important part of the network state. To ensure the normal operation of the network, improve its availability, find network faults in time and deal with network attacks, it is necessary to detect abnormal traffic in the network. Abnormal traffic detection is of great significance in practical network management. Therefore, to improve the accuracy and efficiency of network traffic anomaly detection, this paper proposes a comprehensive anomaly detection method based on improved GRU traffic prediction and improved K-means clustering, and cascades the traffic prediction and clustering to achieve anomaly detection. Firstly, an improved highway-GRU algorithm, HS-GRU (an improved Gate Recurrent Unit neural network based on the Highway network and the STL algorithm), is proposed, which combines the STL decomposition algorithm with a highway GRU neural network and is used to predict traffic. Then, the EFMS-Kmeans algorithm (an improved clustering algorithm that combines a Mean Shift algorithm based on electrostatic force with K-means clustering) is proposed to overcome the shortcoming that traditional K-means clustering cannot automatically determine the number of clusters. The sum of squared errors (SSE) method and the contour (silhouette) coefficient method were both used to check the clustering effect. After determining the cluster centers, the potential energy gradient was used directly for anomaly detection with a threshold method, which considers the local characteristics of the data and ensures the accuracy of anomaly detection. The simulation results show that the anomaly detection algorithm based on HS-GRU and EFMS-Kmeans clustering proposed in this paper can effectively improve the accuracy of flow anomaly detection and has important application value.
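The SSE check mentioned in the abstract above can be written in a few lines. This is a generic sketch of the sum of squared errors for a given clustering, not the paper's EFMS-Kmeans procedure; the function name and the argument layout (points, integer labels, list of centers) are assumptions made for illustration.

```python
def sse(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centers[l]))
        for p, l in zip(points, labels)
    )
```

SSE always decreases as the number of clusters grows, so it is usually read as a curve over K (looking for the bend) and paired with a second criterion, as the abstract does.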
Abstract: Because the kernel K-means clustering method runs in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. To address the deficiency of existing methods, which select the initial cluster centers arbitrarily in the original space, this paper proposes a new method for determining the cluster centers: virtual cluster centers are defined in the feature space from the original classification and used as the initial cluster centers, and the iterative cluster centers are determined by further virtual classification. The improved method is applied to fault diagnosis of roller bearings and achieves good clustering and diagnosis results, which demonstrates its effectiveness.
Funding: Supported by the National Magnetic Confinement Fusion Science Program of China (Nos. 2018YFE0301104 and 2018YFE0301100) and the National Natural Science Foundation of China (Nos. 12075096 and 51821005).
Abstract: Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals, which can be used to identify these events. A semi-supervised machine learning algorithm, the k-means clustering algorithm, is utilized to investigate and identify plasma events in the J-TEXT plasma. This method can cluster diverse plasma events with homogeneous features, and these events can then be identified from a few manually labeled examples based on physical understanding. A survey of clustered events reveals that the k-means algorithm makes plasma events (rotating tearing mode, sawtooth oscillations, and locked mode) gather in the Euclidean space spanned by multi-dimensional diagnostic data, such as soft x-ray emission intensity, edge toroidal rotation velocity, and the Mirnov signal amplitude. Based on the cluster analysis results, an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma. The cluster analysis method is conducive to labeling massive diagnostic data.