Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets ar...Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.展开更多
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial s...The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.展开更多
To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based ...To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based on variational modal decomposition(VMD),fuzzy entropy(FE)and fuzzy clustering(FC).Firstly,based on the OTDR curve data collected in the field,VMD is used to extract the different modal components(IMF)of the original signal and calculate the fuzzy entropy(FE)values of different components to characterize the subtle differences between them.The fuzzy entropy of each curve is used as the feature vector,which in turn constructs the communication optical fibre feature vector matrix,and the fuzzy clustering algorithm is used to achieve fault diagnosis of faulty optical fibre.The VMD-FE combination can extract subtle differences in features,and the fuzzy clustering algorithm does not require sample training.The experimental results show that the model in this paper has high accuracy and is relevant to the maintenance of communication optical fibre when compared with existing feature extraction models and traditional machine learning models.展开更多
Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,a...Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,and can be addressed using clustering and routing techniques.Information is sent from the source to the BS via routing procedures.However,these routing protocols must ensure that packets are delivered securely,guaranteeing that neither adversaries nor unauthentic individuals have access to the sent information.Secure data transfer is intended to protect the data from illegal access,damage,or disruption.Thus,in the proposed model,secure data transmission is developed in an energy-effective manner.A low-energy adaptive clustering hierarchy(LEACH)is developed to efficiently transfer the data.For the intrusion detection systems(IDS),Fuzzy logic and artificial neural networks(ANNs)are proposed.Initially,the nodes were randomly placed in the network and initialized to gather information.To ensure fair energy dissipation between the nodes,LEACH randomly chooses cluster heads(CHs)and allocates this role to the various nodes based on a round-robin management mechanism.The intrusion-detection procedure was then utilized to determine whether intruders were present in the network.Within the WSN,a Fuzzy interference rule was utilized to distinguish the malicious nodes from legal nodes.Subsequently,an ANN was employed to distinguish the harmful nodes from suspicious nodes.The effectiveness of the proposed approach was validated using metrics that attained 97%accuracy,97%specificity,and 97%sensitivity of 95%.Thus,it was proved that the LEACH and Fuzzy-based IDS approaches are the best choices for securing data transmission in an energy-efficient manner.展开更多
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,th...Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,the k-means clustering algorithm,is utilized to investigate and identify plasma events in the J-TEXT plasma.This method can cluster diverse plasma events with homogeneous features,and then these events can be identified if given few manually labeled examples based on physical understanding.A survey of clustered events reveals that the k-means algorithm can make plasma events(rotating tearing mode,sawtooth oscillations,and locked mode)gathering in Euclidean space composed of multi-dimensional diagnostic data,like soft x-ray emission intensity,edge toroidal rotation velocity,the Mirnov signal amplitude and so on.Based on the cluster analysis results,an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma.The cluster analysis method is conducive to data markers of massive diagnostic data.展开更多
Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Wes...Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Western Iraq as an example,a new reservoir classification and discrimination method is established by using the K-means clustering method and the Bayesian discrimination method.These methods are applied to non-cored wells to calculate the discrimination accuracy of the reservoir type,and thus the main reasons for low accuracy of reservoir discrimination are clarified.The results show that the discrimination accuracy of reservoir type based on K-means clustering and Bayesian stepwise discrimination is strongly related to the accuracy of the core data.The discrimination accuracy rate of TypeⅠ,TypeⅡ,and TypeⅤreservoirs is found to be significantly higher than that of TypeⅢand TypeⅣreservoirs using the method of combining K-means clustering and Bayesian theory based on logging data.Although the recognition accuracy of the new methodology for the TypeⅣreservoir is low,with average accuracy the new method has reached more than 82%in the entire study area,which lays a good foundation for rapid and accurate discrimination of reservoir types and the fine evaluation of a reservoir.展开更多
Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease ...Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease and treating it is pretty challenging in this period.Image processing is employed to detect plant disease since it requires much effort and an extended processing period.The main goal of this study is to discover the disease that affects the plants by creating an image processing system that can recognize and classify four different forms of plant diseases,including Phytophthora infestans,Fusarium graminearum,Puccinia graminis,tomato yellow leaf curl.Therefore,this work uses the Support vector machine(SVM)classifier to detect and classify the plant disease using various steps like image acquisition,Pre-processing,Segmentation,feature extraction,and classification.The gray level co-occurrence matrix(GLCM)and the local binary pattern features(LBP)are used to identify the disease-affected portion of the plant leaf.According to experimental data,the proposed technology can correctly detect and diagnose plant sickness with a 97.2 percent accuracy.展开更多
The COVID-19 pandemic has caused an unprecedented spike in confirmed cases in 230 countries globally. In this work, a set of data from the COVID-19 coronavirus outbreak has been subjected to two well-known unsupervise...The COVID-19 pandemic has caused an unprecedented spike in confirmed cases in 230 countries globally. In this work, a set of data from the COVID-19 coronavirus outbreak has been subjected to two well-known unsupervised learning techniques: K-means clustering and correlation. The COVID-19 virus has infected several nations, and K-means automatically looks for undiscovered clusters of those infections. To examine the spread of COVID-19 before a vaccine becomes widely available, this work has used unsupervised approaches to identify the crucial county-level confirmed cases, death cases, recover cases, total_cases_per_million, and total_deaths_per_million aspects of county-level variables. We combined countries into significant clusters using this feature subspace to assist more in-depth disease analysis efforts. As a result, we used a clustering technique to examine various trends in COVID-19 incidence and mortality across nations. This technique took the key components of a trajectory and incorporates them into a K-means clustering process. We separated the trend lines into measures that characterize various features of a trend. The measurements were first reduced in dimension, then clustered using a K-means algorithm. This method was used to individually calculate the incidence and death rates and then compare them.展开更多
Many forecasting models based on the concepts of Fuzzy time series have been proposed in the past decades. These models have been widely applied to various problem domains, especially in dealing with forecasting probl...Many forecasting models based on the concepts of Fuzzy time series have been proposed in the past decades. These models have been widely applied to various problem domains, especially in dealing with forecasting problems in which historical data are linguistic values. In this paper, we present a new fuzzy time series forecasting model, which uses the historical data as the universe of discourse and uses the K-means clustering algorithm to cluster the universe of discourse, then adjust the clusters into intervals. The proposed method is applied for forecasting University enrollment of Alabama. It is shown that the proposed model achieves a significant improvement in forecasting accuracy as compared to other fuzzy time series forecasting models.展开更多
An improved fuzzy time series algorithmbased on clustering is designed in this paper.The algorithm is successfully applied to short-term load forecasting in the distribution stations.Firstly,the K-means clustering met...An improved fuzzy time series algorithmbased on clustering is designed in this paper.The algorithm is successfully applied to short-term load forecasting in the distribution stations.Firstly,the K-means clustering method is used to cluster the data,and the midpoint of two adjacent clustering centers is taken as the dividing point of domain division.On this basis,the data is fuzzed to form a fuzzy time series.Secondly,a high-order fuzzy relation with multiple antecedents is established according to the main measurement indexes of power load,which is used to predict the short-term trend change of load in the distribution stations.Matlab/Simulink simulation results show that the load forecasting errors of the typical fuzzy time series on the time scale of one day and one week are[−50,20]and[−50,30],while the load forecasting errors of the improved fuzzy time series on the time scale of one day and one week are[−20,15]and[−20,25].It shows that the fuzzy time series algorithm improved by clustering improves the prediction accuracy and can effectively predict the short-term load trend of distribution stations.展开更多
Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize cl...Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.展开更多
Encephalitis is a brain inflammation disease.Encephalitis can yield to seizures,motor disability,or some loss of vision or hearing.Sometimes,encepha-litis can be a life-threatening and proper diagnosis in an early stag...Encephalitis is a brain inflammation disease.Encephalitis can yield to seizures,motor disability,or some loss of vision or hearing.Sometimes,encepha-litis can be a life-threatening and proper diagnosis in an early stage is very crucial.Therefore,in this paper,we are proposing a deep learning model for computerized detection of Encephalitis from the electroencephalogram data(EEG).Also,we propose a Density-Based Clustering model to classify the distinctive waves of Encephalitis.Customary clustering models usually employ a computed single centroid virtual point to define the cluster configuration,but this single point does not contain adequate information.To precisely extract accurate inner structural data,a multiple centroids approach is employed and defined in this paper,which defines the cluster configuration by allocating weights to each state in the cluster.The multiple EEG view fuzzy learning approach incorporates data from every sin-gle view to enhance the model's clustering performance.Also a fuzzy Density-Based Clustering model with multiple centroids(FDBC)is presented.This model employs multiple real state centroids to define clusters using Partitioning Around Centroids algorithm.The Experimental results validate the medical importance of the proposed clustering model.展开更多
Recently,the fundamental problem with Hybrid Mobile Ad-hoc Net-works(H-MANETs)is tofind a suitable and secure way of balancing the load through Internet gateways.Moreover,the selection of the gateway and overload of th...Recently,the fundamental problem with Hybrid Mobile Ad-hoc Net-works(H-MANETs)is tofind a suitable and secure way of balancing the load through Internet gateways.Moreover,the selection of the gateway and overload of the network results in packet loss and Delay(DL).For optimal performance,it is important to load balance between different gateways.As a result,a stable load balancing procedure is implemented,which selects gateways based on Fuzzy Logic(FL)and increases the efficiency of the network.In this case,since gate-ways are selected based on the number of nodes,the Energy Consumption(EC)was high.This paper presents a novel Node Quality-based Clustering Algo-rithm(NQCA)based on Fuzzy-Genetic for Cluster Head and Gateway Selection(FGCHGS).This algorithm combines NQCA with the Improved Weighted Clus-tering Algorithm(IWCA).The NQCA algorithm divides the network into clusters based upon node priority,transmission range,and neighbourfidelity.In addition,the simulation results tend to evaluate the performance effectiveness of the FFFCHGS algorithm in terms of EC,packet loss rate(PLR),etc.展开更多
With the rapid development of the economy,the scale of the power grid is expanding.The number of power equipment that constitutes the power grid has been very large,which makes the state data of power equipment grow e...With the rapid development of the economy,the scale of the power grid is expanding.The number of power equipment that constitutes the power grid has been very large,which makes the state data of power equipment grow explosively.These multi-source heterogeneous data have data differences,which lead to data variation in the process of transmission and preservation,thus forming the bad information of incomplete data.Therefore,the research on data integrity has become an urgent task.This paper is based on the characteristics of random chance and the Spatio-temporal difference of the system.According to the characteristics and data sources of the massive data generated by power equipment,the fuzzy mining model of power equipment data is established,and the data is divided into numerical and non-numerical data based on numerical data.Take the text data of power equipment defects as the mining material.Then,the Apriori algorithm based on an array is used to mine deeply.The strong association rules in incomplete data of power equipment are obtained and analyzed.From the change trend of NRMSE metrics and classification accuracy,most of the filling methods combined with the two frameworks in this method usually show a relatively stable filling trend,and will not fluctuate greatly with the growth of the missing rate.The experimental results show that the proposed algorithm model can effectively improve the filling effect of the existing filling methods on most data sets,and the filling effect fluctuates greatly with the increase of the missing rate,that is,with the increase of the missing rate,the improvement effect of the model for the existing filling methods is higher than 4.3%.Through the incomplete data clustering technology studied in this paper,a more innovative state assessment of smart grid reliability operation is carried out,which has good research value and reference significance.展开更多
Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on the conventional K-means clustering is very fast and is also easy to implement. However, the accuracy of this method is generally not ...Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on the conventional K-means clustering is very fast and is also easy to implement. However, the accuracy of this method is generally not satisfactory. The contribution of the vector x(t) with different modules is theoretically proved to be unequal, and a weighted K-means clustering method is proposed on this grounds. The proposed algorithm is not only as fast as the conventional K-means clustering method, but can also achieve considerably accurate results, which is demonstrated by numerical experiments.展开更多
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and ...The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the un- confined compression strengths and other parameters of the different soils.展开更多
Tarq geochemical 1:100,000 Sheet is located in Isfahan province which is investigated by Iran’s Geological and Explorations Organization using stream sediment analyzes. This area has stratigraphy of Precambrian to Qu...Tarq geochemical 1:100,000 Sheet is located in Isfahan province which is investigated by Iran’s Geological and Explorations Organization using stream sediment analyzes. This area has stratigraphy of Precambrian to Quaternary rocks and is located in the Central Iran zone. According to the presence of signs of gold mineralization in this area, it is necessary to identify important mineral areas in this area. Therefore, finding information is necessary about the relationship and monitoring the elements of gold, arsenic, and antimony relative to each other in this area to determine the extent of geochemical halos and to estimate the grade. Therefore, a well-known and useful K-means method is used for monitoring the elements in the present study, this is a clustering method based on minimizing the total Euclidean distances of each sample from the center of the classes which are assigned to them. In this research, the clustering quality function and the utility rate of the sample have been used in the desired cluster (S(i)) to determine the optimum number of clusters. Finally, with regard to the cluster centers and the results, the equations were used to predict the amount of the gold element based on four parameters of arsenic and antimony grade, length and width of sampling points.展开更多
Based on the Joint Typhoon Warning Center(JTWC) best-track dataset between 1965 and 2009 and the characteristic parameters including tropical cyclone(TC) position,intensity,path length and direction,a method for objec...Based on the Joint Typhoon Warning Center(JTWC) best-track dataset between 1965 and 2009 and the characteristic parameters including tropical cyclone(TC) position,intensity,path length and direction,a method for objective classification of the Northwestern Pacific tropical cyclone tracks is established by using k-means Clustering.The TC lifespan,energy,active season and landfall probability of seven clusters of tropical cyclone tracks are comparatively analyzed.The characteristics of these parameters are quite different among different tropical cyclone track clusters.From the trend of the past two decades,the frequency of the western recurving cluster(accounting for 21.3% of the total) increased,and the lifespan elongated slightly,which differs from the other clusters.The annual variation of the Power Dissipation Index(PDI) of most clusters mainly depended on the TC intensity and frequency.However,the annual variation of the PDI in the northwestern moving then recurving cluster and the pelagic west-northwest moving cluster mainly depended on the frequency.展开更多
Due to the widespread use of the Internet,customer information is vulnerable to computer systems attack,which brings urgent need for the intrusion detection technology.Recently,network intrusion detection has been one...Due to the widespread use of the Internet,customer information is vulnerable to computer systems attack,which brings urgent need for the intrusion detection technology.Recently,network intrusion detection has been one of the most important technologies in network security detection.The accuracy of network intrusion detection has reached higher accuracy so far.However,these methods have very low efficiency in network intrusion detection,even the most popular SOM neural network method.In this paper,an efficient and fast network intrusion detection method was proposed.Firstly,the fundamental of the two different methods are introduced respectively.Then,the selforganizing feature map neural network based on K-means clustering(KSOM)algorithms was presented to improve the efficiency of network intrusion detection.Finally,the NSLKDD is used as network intrusion data set to demonstrate that the KSOM method can significantly reduce the number of clustering iteration than SOM method without substantially affecting the clustering results and the accuracy is much higher than Kmeans method.The Experimental results show that our method can relatively improve the accuracy of network intrusion and significantly reduce the number of clustering iteration.展开更多
基金Supported by the National Natural Science Foundation of China(61273209)
文摘Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
文摘The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
基金This paper is supported by State Grid Gansu Electric Power Company Science and Technology Project(20220515003).
文摘To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based on variational modal decomposition(VMD),fuzzy entropy(FE)and fuzzy clustering(FC).Firstly,based on the OTDR curve data collected in the field,VMD is used to extract the different modal components(IMF)of the original signal and calculate the fuzzy entropy(FE)values of different components to characterize the subtle differences between them.The fuzzy entropy of each curve is used as the feature vector,which in turn constructs the communication optical fibre feature vector matrix,and the fuzzy clustering algorithm is used to achieve fault diagnosis of faulty optical fibre.The VMD-FE combination can extract subtle differences in features,and the fuzzy clustering algorithm does not require sample training.The experimental results show that the model in this paper has high accuracy and is relevant to the maintenance of communication optical fibre when compared with existing feature extraction models and traditional machine learning models.
文摘Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,and can be addressed using clustering and routing techniques.Information is sent from the source to the BS via routing procedures.However,these routing protocols must ensure that packets are delivered securely,guaranteeing that neither adversaries nor unauthentic individuals have access to the sent information.Secure data transfer is intended to protect the data from illegal access,damage,or disruption.Thus,in the proposed model,secure data transmission is developed in an energy-effective manner.A low-energy adaptive clustering hierarchy(LEACH)is developed to efficiently transfer the data.For the intrusion detection systems(IDS),Fuzzy logic and artificial neural networks(ANNs)are proposed.Initially,the nodes were randomly placed in the network and initialized to gather information.To ensure fair energy dissipation between the nodes,LEACH randomly chooses cluster heads(CHs)and allocates this role to the various nodes based on a round-robin management mechanism.The intrusion-detection procedure was then utilized to determine whether intruders were present in the network.Within the WSN,a Fuzzy interference rule was utilized to distinguish the malicious nodes from legal nodes.Subsequently,an ANN was employed to distinguish the harmful nodes from suspicious nodes.The effectiveness of the proposed approach was validated using metrics that attained 97%accuracy,97%specificity,and 97%sensitivity of 95%.Thus,it was proved that the LEACH and Fuzzy-based IDS approaches are the best choices for securing data transmission in an energy-efficient manner.
基金supported by the National Magnetic Confinement Fusion Science Program of China(Nos.2018YFE0301104 and 2018YFE0301100)National Natural Science Foundation of China(Nos.12075096 and 51821005)。
文摘Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,the k-means clustering algorithm,is utilized to investigate and identify plasma events in the J-TEXT plasma.This method can cluster diverse plasma events with homogeneous features,and then these events can be identified if given few manually labeled examples based on physical understanding.A survey of clustered events reveals that the k-means algorithm can make plasma events(rotating tearing mode,sawtooth oscillations,and locked mode)gathering in Euclidean space composed of multi-dimensional diagnostic data,like soft x-ray emission intensity,edge toroidal rotation velocity,the Mirnov signal amplitude and so on.Based on the cluster analysis results,an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma.The cluster analysis method is conducive to data markers of massive diagnostic data.
基金funded by the National Key Research and Development Program(Grant No.2018YFC0807804-2)。
文摘Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Western Iraq as an example,a new reservoir classification and discrimination method is established by using the K-means clustering method and the Bayesian discrimination method.These methods are applied to non-cored wells to calculate the discrimination accuracy of the reservoir type,and thus the main reasons for low accuracy of reservoir discrimination are clarified.The results show that the discrimination accuracy of reservoir type based on K-means clustering and Bayesian stepwise discrimination is strongly related to the accuracy of the core data.The discrimination accuracy rate of TypeⅠ,TypeⅡ,and TypeⅤreservoirs is found to be significantly higher than that of TypeⅢand TypeⅣreservoirs using the method of combining K-means clustering and Bayesian theory based on logging data.Although the recognition accuracy of the new methodology for the TypeⅣreservoir is low,with average accuracy the new method has reached more than 82%in the entire study area,which lays a good foundation for rapid and accurate discrimination of reservoir types and the fine evaluation of a reservoir.
基金supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2023R104)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease and treating it is pretty challenging in this period.Image processing is employed to detect plant disease since it requires much effort and an extended processing period.The main goal of this study is to discover the disease that affects the plants by creating an image processing system that can recognize and classify four different forms of plant diseases,including Phytophthora infestans,Fusarium graminearum,Puccinia graminis,tomato yellow leaf curl.Therefore,this work uses the Support vector machine(SVM)classifier to detect and classify the plant disease using various steps like image acquisition,Pre-processing,Segmentation,feature extraction,and classification.The gray level co-occurrence matrix(GLCM)and the local binary pattern features(LBP)are used to identify the disease-affected portion of the plant leaf.According to experimental data,the proposed technology can correctly detect and diagnose plant sickness with a 97.2 percent accuracy.
文摘The COVID-19 pandemic has caused an unprecedented spike in confirmed cases in 230 countries globally. In this work, a set of data from the COVID-19 coronavirus outbreak has been subjected to two well-known unsupervised learning techniques: K-means clustering and correlation. The COVID-19 virus has infected several nations, and K-means automatically looks for undiscovered clusters of those infections. To examine the spread of COVID-19 before a vaccine becomes widely available, this work has used unsupervised approaches to identify the crucial county-level confirmed cases, death cases, recover cases, total_cases_per_million, and total_deaths_per_million aspects of county-level variables. We combined countries into significant clusters using this feature subspace to assist more in-depth disease analysis efforts. As a result, we used a clustering technique to examine various trends in COVID-19 incidence and mortality across nations. This technique took the key components of a trajectory and incorporates them into a K-means clustering process. We separated the trend lines into measures that characterize various features of a trend. The measurements were first reduced in dimension, then clustered using a K-means algorithm. This method was used to individually calculate the incidence and death rates and then compare them.
文摘Many forecasting models based on the concepts of Fuzzy time series have been proposed in the past decades. These models have been widely applied to various problem domains, especially in dealing with forecasting problems in which historical data are linguistic values. In this paper, we present a new fuzzy time series forecasting model, which uses the historical data as the universe of discourse and uses the K-means clustering algorithm to cluster the universe of discourse, then adjust the clusters into intervals. The proposed method is applied for forecasting University enrollment of Alabama. It is shown that the proposed model achieves a significant improvement in forecasting accuracy as compared to other fuzzy time series forecasting models.
基金supported by the National Natural Science Foundation of China under Grant 51777193.
文摘An improved fuzzy time series algorithmbased on clustering is designed in this paper.The algorithm is successfully applied to short-term load forecasting in the distribution stations.Firstly,the K-means clustering method is used to cluster the data,and the midpoint of two adjacent clustering centers is taken as the dividing point of domain division.On this basis,the data is fuzzed to form a fuzzy time series.Secondly,a high-order fuzzy relation with multiple antecedents is established according to the main measurement indexes of power load,which is used to predict the short-term trend change of load in the distribution stations.Matlab/Simulink simulation results show that the load forecasting errors of the typical fuzzy time series on the time scale of one day and one week are[−50,20]and[−50,30],while the load forecasting errors of the improved fuzzy time series on the time scale of one day and one week are[−20,15]and[−20,25].It shows that the fuzzy time series algorithm improved by clustering improves the prediction accuracy and can effectively predict the short-term load trend of distribution stations.
基金This research is funded by Graduate University of Science and Technology under grant number GUST.STS.DT2020-TT01。
文摘Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.
基金funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2022R113)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Encephalitis is a brain inflammation disease.Encephalitis can yield to seizures,motor disability,or some loss of vision or hearing.Sometimes,encepha-litis can be a life-threatening and proper diagnosis in an early stage is very crucial.Therefore,in this paper,we are proposing a deep learning model for computerized detection of Encephalitis from the electroencephalogram data(EEG).Also,we propose a Density-Based Clustering model to classify the distinctive waves of Encephalitis.Customary clustering models usually employ a computed single centroid virtual point to define the cluster configuration,but this single point does not contain adequate information.To precisely extract accurate inner structural data,a multiple centroids approach is employed and defined in this paper,which defines the cluster configuration by allocating weights to each state in the cluster.The multiple EEG view fuzzy learning approach incorporates data from every sin-gle view to enhance the model's clustering performance.Also a fuzzy Density-Based Clustering model with multiple centroids(FDBC)is presented.This model employs multiple real state centroids to define clusters using Partitioning Around Centroids algorithm.The Experimental results validate the medical importance of the proposed clustering model.
文摘Recently,the fundamental problem with Hybrid Mobile Ad-hoc Net-works(H-MANETs)is tofind a suitable and secure way of balancing the load through Internet gateways.Moreover,the selection of the gateway and overload of the network results in packet loss and Delay(DL).For optimal performance,it is important to load balance between different gateways.As a result,a stable load balancing procedure is implemented,which selects gateways based on Fuzzy Logic(FL)and increases the efficiency of the network.In this case,since gate-ways are selected based on the number of nodes,the Energy Consumption(EC)was high.This paper presents a novel Node Quality-based Clustering Algo-rithm(NQCA)based on Fuzzy-Genetic for Cluster Head and Gateway Selection(FGCHGS).This algorithm combines NQCA with the Improved Weighted Clus-tering Algorithm(IWCA).The NQCA algorithm divides the network into clusters based upon node priority,transmission range,and neighbourfidelity.In addition,the simulation results tend to evaluate the performance effectiveness of the FFFCHGS algorithm in terms of EC,packet loss rate(PLR),etc.
文摘With the rapid development of the economy,the scale of the power grid is expanding.The number of power equipment that constitutes the power grid has been very large,which makes the state data of power equipment grow explosively.These multi-source heterogeneous data have data differences,which lead to data variation in the process of transmission and preservation,thus forming the bad information of incomplete data.Therefore,the research on data integrity has become an urgent task.This paper is based on the characteristics of random chance and the Spatio-temporal difference of the system.According to the characteristics and data sources of the massive data generated by power equipment,the fuzzy mining model of power equipment data is established,and the data is divided into numerical and non-numerical data based on numerical data.Take the text data of power equipment defects as the mining material.Then,the Apriori algorithm based on an array is used to mine deeply.The strong association rules in incomplete data of power equipment are obtained and analyzed.From the change trend of NRMSE metrics and classification accuracy,most of the filling methods combined with the two frameworks in this method usually show a relatively stable filling trend,and will not fluctuate greatly with the growth of the missing rate.The experimental results show that the proposed algorithm model can effectively improve the filling effect of the existing filling methods on most data sets,and the filling effect fluctuates greatly with the increase of the missing rate,that is,with the increase of the missing rate,the improvement effect of the model for the existing filling methods is higher than 4.3%.Through the incomplete data clustering technology studied in this paper,a more innovative state assessment of smart grid reliability operation is carried out,which has good research value and reference significance.
基金the National Natural Science Foundation of China (60672061)
文摘Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on the conventional K-means clustering is very fast and is also easy to implement. However, the accuracy of this method is generally not satisfactory. The contribution of the vector x(t) with different modules is theoretically proved to be unequal, and a weighted K-means clustering method is proposed on this grounds. The proposed algorithm is not only as fast as the conventional K-means clustering method, but can also achieve considerably accurate results, which is demonstrated by numerical experiments.
文摘The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the un- confined compression strengths and other parameters of the different soils.
文摘Tarq geochemical 1:100,000 Sheet is located in Isfahan province which is investigated by Iran’s Geological and Explorations Organization using stream sediment analyzes. This area has stratigraphy of Precambrian to Quaternary rocks and is located in the Central Iran zone. According to the presence of signs of gold mineralization in this area, it is necessary to identify important mineral areas in this area. Therefore, finding information is necessary about the relationship and monitoring the elements of gold, arsenic, and antimony relative to each other in this area to determine the extent of geochemical halos and to estimate the grade. Therefore, a well-known and useful K-means method is used for monitoring the elements in the present study, this is a clustering method based on minimizing the total Euclidean distances of each sample from the center of the classes which are assigned to them. In this research, the clustering quality function and the utility rate of the sample have been used in the desired cluster (S(i)) to determine the optimum number of clusters. Finally, with regard to the cluster centers and the results, the equations were used to predict the amount of the gold element based on four parameters of arsenic and antimony grade, length and width of sampling points.
基金National Basic Research Program of China(973 Program)(2015CB453200),2012CB955903)National Natural Science Foundation of China(41575083,41575108)Jiangsu Education Science Foundation(13KJA170002)
文摘Based on the Joint Typhoon Warning Center(JTWC) best-track dataset between 1965 and 2009 and the characteristic parameters including tropical cyclone(TC) position,intensity,path length and direction,a method for objective classification of the Northwestern Pacific tropical cyclone tracks is established by using k-means Clustering.The TC lifespan,energy,active season and landfall probability of seven clusters of tropical cyclone tracks are comparatively analyzed.The characteristics of these parameters are quite different among different tropical cyclone track clusters.From the trend of the past two decades,the frequency of the western recurving cluster(accounting for 21.3% of the total) increased,and the lifespan elongated slightly,which differs from the other clusters.The annual variation of the Power Dissipation Index(PDI) of most clusters mainly depended on the TC intensity and frequency.However,the annual variation of the PDI in the northwestern moving then recurving cluster and the pelagic west-northwest moving cluster mainly depended on the frequency.
文摘Due to the widespread use of the Internet,customer information is vulnerable to computer systems attack,which brings urgent need for the intrusion detection technology.Recently,network intrusion detection has been one of the most important technologies in network security detection.The accuracy of network intrusion detection has reached higher accuracy so far.However,these methods have very low efficiency in network intrusion detection,even the most popular SOM neural network method.In this paper,an efficient and fast network intrusion detection method was proposed.Firstly,the fundamental of the two different methods are introduced respectively.Then,the selforganizing feature map neural network based on K-means clustering(KSOM)algorithms was presented to improve the efficiency of network intrusion detection.Finally,the NSLKDD is used as network intrusion data set to demonstrate that the KSOM method can significantly reduce the number of clustering iteration than SOM method without substantially affecting the clustering results and the accuracy is much higher than Kmeans method.The Experimental results show that our method can relatively improve the accuracy of network intrusion and significantly reduce the number of clustering iteration.