220 articles found
Comprehensive K-Means Clustering
1
Authors: Ethan Xiao | Journal of Computer and Communications | 2024, Issue 3, pp. 146-159 (14 pages)
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
Keywords: k-means clustering
Investigation of the J-TEXT plasma events by k-means clustering algorithm (cited by: 1)
2
Authors: 李建超, 张晓卿, 张昱, Abba Alhaji BALA, 柳惠平, 周帼红, 王能超, 李达, 陈忠勇, 杨州军, 陈志鹏, 董蛟龙, 丁永华, the J-TEXT Team | Plasma Science and Technology (SCIE, EI, CAS, CSCD) | 2023, Issue 8, pp. 38-43 (6 pages)
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals, which can be applied to identify these events. A semi-supervised machine learning algorithm, the k-means clustering algorithm, is utilized to investigate and identify plasma events in the J-TEXT plasma. This method can cluster diverse plasma events with homogeneous features, and these events can then be identified from a few manually labeled examples based on physical understanding. A survey of clustered events reveals that the k-means algorithm makes plasma events (rotating tearing mode, sawtooth oscillations, and locked mode) gather in the Euclidean space formed by multi-dimensional diagnostic data, such as soft x-ray emission intensity, edge toroidal rotation velocity, and the Mirnov signal amplitude. Based on the cluster analysis results, an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma. The cluster analysis method is conducive to labeling massive diagnostic data.
Keywords: k-means; cluster analysis; plasma event; machine learning
Quantitative Method of Classification and Discrimination of a Porous Carbonate Reservoir Integrating K-means Clustering and Bayesian Theory
3
Authors: FANG Xinxin, ZHU Guotao, YANG Yiming, LI Fengling, FENG Hong | Acta Geologica Sinica (English Edition) (SCIE, CAS, CSCD) | 2023, Issue 1, pp. 176-189 (14 pages)
Reservoir classification is a key link in reservoir evaluation. However, traditional manual methods are inefficient and subjective, and classification standards are not uniform. Therefore, taking the Mishrif Formation of western Iraq as an example, a new reservoir classification and discrimination method is established by using the K-means clustering method and the Bayesian discrimination method. These methods are applied to non-cored wells to calculate the discrimination accuracy of the reservoir type, and thus the main reasons for low accuracy of reservoir discrimination are clarified. The results show that the discrimination accuracy of reservoir type based on K-means clustering and Bayesian stepwise discrimination is strongly related to the accuracy of the core data. Using the method that combines K-means clustering and Bayesian theory based on logging data, the discrimination accuracy for Type Ⅰ, Type Ⅱ, and Type Ⅴ reservoirs is found to be significantly higher than that for Type Ⅲ and Type Ⅳ reservoirs. Although the recognition accuracy of the new methodology for the Type Ⅳ reservoir is low, its average accuracy exceeds 82% across the entire study area, which lays a good foundation for rapid and accurate discrimination of reservoir types and the fine evaluation of a reservoir.
Keywords: UPSTREAM; resource exploration; reservoir classification; CARBONATE; k-means clustering; Bayesian discrimination; CENOMANIAN-TURONIAN; Iraq
Plant Leaf Diseases Classification Using Improved K-Means Clustering and SVM Algorithm for Segmentation
4
Authors: Mona Jamjoom, Ahmed Elhadad, Hussein Abulkasim, Safia Abbas | Computers, Materials & Continua (SCIE, EI) | 2023, Issue 7, pp. 367-382 (16 pages)
Several pests feed on leaves, stems, bases, and the entire plant, causing plant illnesses. As a result, it is vital to identify and eliminate the disease before it causes any damage to plants. Manually detecting and treating plant disease is quite challenging. Image processing is employed to detect plant disease because manual detection requires much effort and an extended processing period. The main goal of this study is to discover the disease that affects the plants by creating an image processing system that can recognize and classify four different forms of plant diseases, including Phytophthora infestans, Fusarium graminearum, Puccinia graminis, and tomato yellow leaf curl. Therefore, this work uses the support vector machine (SVM) classifier to detect and classify the plant disease using various steps such as image acquisition, pre-processing, segmentation, feature extraction, and classification. The gray level co-occurrence matrix (GLCM) and the local binary pattern (LBP) features are used to identify the disease-affected portion of the plant leaf. According to experimental data, the proposed technology can correctly detect and diagnose plant sickness with 97.2 percent accuracy.
Keywords: SVM; machine learning; GLCM algorithm; k-means clustering; LBP
Clustering Countries on COVID-19 Data among Different Waves Using K-Means Clustering
5
Authors: Muhtasim, Md. Abdul Masud | Journal of Computer and Communications | 2023, Issue 7, pp. 1-14 (14 pages)
The COVID-19 pandemic has caused an unprecedented spike in confirmed cases in 230 countries globally. In this work, a set of data from the COVID-19 coronavirus outbreak has been subjected to two well-known unsupervised learning techniques: K-means clustering and correlation. The COVID-19 virus has infected several nations, and K-means automatically looks for undiscovered clusters of those infections. To examine the spread of COVID-19 before a vaccine becomes widely available, this work has used unsupervised approaches to identify the crucial county-level confirmed cases, death cases, recover cases, total_cases_per_million, and total_deaths_per_million aspects of county-level variables. We combined countries into significant clusters using this feature subspace to assist more in-depth disease analysis efforts. As a result, we used a clustering technique to examine various trends in COVID-19 incidence and mortality across nations. This technique took the key components of a trajectory and incorporated them into a K-means clustering process. We separated the trend lines into measures that characterize various features of a trend. The measurements were first reduced in dimension, then clustered using a K-means algorithm. This method was used to individually calculate the incidence and death rates and then compare them.
Keywords: COVID-19; Epidemic; k-means clustering; CORRELATIONS; Infection Control; SARS-CoV-2; Time Series
Improved k-means clustering algorithm (cited by: 16)
6
Authors: 夏士雄, 李文超, 周勇, 张磊, 牛强 | Journal of Southeast University (English Edition) (EI, CAS) | 2007, Issue 3, pp. 435-438 (4 pages)
To address two disadvantages of the k-means algorithm, namely that the number of clusters must be known in advance and that the result is sensitive to the choice of initial cluster centers, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is determined by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial cluster centers are determined. Finally, the clustering is completed by the traditional k-means clustering. Theoretical analysis proves that the improved k-means clustering algorithm has proper computational complexity. The experimental results on the IRIS testing data set show that the algorithm can distinguish different clusters reasonably and recognize the outliers efficiently, and the entropy generated by the algorithm is lower.
Keywords: clustering; k-means algorithm; silhouette coefficient
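The silhouette-based choice of K described in this abstract can be sketched as follows. This is a minimal illustration using the standard silhouette definition s(i) = (b - a) / max(a, b), not the authors' implementation. The function name `silhouette_mean`, the toy points, and the hand-written candidate labelings (hypothetical stand-ins for k-means results at K = 2, 3, 4) are all assumptions made for the example.

```python
import math


def silhouette_mean(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data with three visually obvious groups (assumed for illustration).
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]

# Hypothetical labelings standing in for k-means output at each K.
candidates = {
    2: [0, 0, 0, 0, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
    4: [0, 0, 1, 1, 2, 3],
}

scores = {k: silhouette_mean(points, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The K with the largest mean silhouette is taken as the candidate for Kopt; in practice each labeling would come from an actual clustering run rather than being written by hand.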
An efficient enhanced k-means clustering algorithm (cited by: 30)
7
Authors: FAHIM A.M, SALEM A.M, TORKEY F.A, RAMADAN M.A | Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD) | 2006, Issue 10, pp. 1626-1633 (8 pages)
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.
Keywords: clustering algorithms; cluster analysis; k-means algorithm; Data analysis
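The problem statement in this abstract (choose k centers that minimize the mean squared distance from each point to its nearest center) can be made concrete with a short sketch. The code below is the plain Lloyd iteration, i.e. the baseline k-means, and not the enhanced algorithm of the paper. The function names, the toy data, and the seeded random initialization are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points) / len(points)


def lloyd_kmeans(points, k, iterations=100, seed=0):
    """Baseline Lloyd iteration: assign points to nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iterations):
        # Assignment step: group points by the index of their nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved, so the iteration has converged
            break
        centers = new_centers
    return centers


# Two well-separated groups of three points each (assumed toy data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = lloyd_kmeans(points, k=2)
print(sorted(centers), mean_squared_distance(points, centers))
```

Every iteration of this baseline recomputes all n × k distances, which is the cost that speed-oriented variants such as the one in this entry aim to reduce.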
Hierarchical hesitant fuzzy K-means clustering algorithm (cited by: 21)
8
Authors: CHEN Na, XU Ze-shui, XIA Mei-mei | Applied Mathematics (A Journal of Chinese Universities) (SCIE, CSCD) | 2014, Issue 1, pp. 1-17 (17 pages)
Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, a case in which conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
Keywords: 90B50; 68T10; 62H30; Hesitant fuzzy set; hierarchical clustering; k-means clustering; intuitionistic fuzzy set
K-MEANS CLUSTERING FOR CLASSIFICATION OF THE NORTHWESTERN PACIFIC TROPICAL CYCLONE TRACKS (cited by: 4)
9
Authors: 余锦华, 郑颖青, 吴启树, 林金凎, 龚振彬 | Journal of Tropical Meteorology (SCIE) | 2016, Issue 2, pp. 127-135 (9 pages)
Based on the Joint Typhoon Warning Center (JTWC) best-track dataset between 1965 and 2009 and the characteristic parameters including tropical cyclone (TC) position, intensity, path length and direction, a method for objective classification of the Northwestern Pacific tropical cyclone tracks is established by using k-means clustering. The TC lifespan, energy, active season and landfall probability of seven clusters of tropical cyclone tracks are comparatively analyzed. The characteristics of these parameters are quite different among different tropical cyclone track clusters. From the trend of the past two decades, the frequency of the western recurving cluster (accounting for 21.3% of the total) increased, and the lifespan elongated slightly, which differs from the other clusters. The annual variation of the Power Dissipation Index (PDI) of most clusters mainly depended on the TC intensity and frequency. However, the annual variation of the PDI in the northwestern moving then recurving cluster and the pelagic west-northwest moving cluster mainly depended on the frequency.
Keywords: tropical cyclone; classification of tracks; k-means clustering; character of cluster
Classification of Northeast China Cold Vortex Activity Paths in Early Summer Based on K-means Clustering and Their Climate Impact (cited by: 11)
10
Authors: Yihe FANG, Haishan CHEN, Yi LIN, Chunyu ZHAO, Yitong LIN, Fang ZHOU | Advances in Atmospheric Sciences (SCIE, CAS, CSCD) | 2021, Issue 3, pp. 400-412 (13 pages)
The classification of the Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze its characteristics in detail. Based on the daily precipitation data of the northeastern China (NEC) region, and the atmospheric circulation field and temperature field data of ERA-Interim for every six hours, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. Then, the NCCV processes were classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects, as follows: (1) the atmospheric circulation configuration of the NCCV on various paths; and (2) its influences on the climate conditions in the NEC. The obtained results showed that the activity paths of the NCCV could be divided into four types according to such characteristics as the generation origin, movement direction, and movement velocity of the NCCV. These included the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type or type A); generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type or type B); generation-eastward less-movement type near Lake Baikal (eastward less-movement type or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type or type D). There were obvious differences observed in the atmospheric circulation configuration and the climate impact of the NCCV on the four above-mentioned types of paths, which indicated that the classification results were reasonable.
Keywords: northeastern China; early summer; Northeast China Cold Vortex; classification of activity paths; machine learning method; k-means clustering; high-pressure blocking
Blind source separation by weighted K-means clustering (cited by: 5)
11
Authors: Yi Qingming | Journal of Systems Engineering and Electronics (SCIE, EI, CSCD) | 2008, Issue 5, pp. 882-887 (6 pages)
Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on the conventional K-means clustering is very fast and is also easy to implement. However, the accuracy of this method is generally not satisfactory. The contribution of the vector x(t) with different modules is theoretically proved to be unequal, and a weighted K-means clustering method is proposed on these grounds. The proposed algorithm is not only as fast as the conventional K-means clustering method, but can also achieve considerably accurate results, which is demonstrated by numerical experiments.
Keywords: blind source separation; underdetermined mixing; sparse representation; weighted k-means clustering
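To illustrate what weighting changes in the centroid update, here is a minimal sketch of a weighted mean. The abstract does not state the weighting rule, so using each vector's modulus (Euclidean norm) as its weight is purely an assumption for illustration. The function name `weighted_centroid` and the two sample vectors are likewise hypothetical.

```python
import math


def weighted_centroid(vectors, weights):
    """Weighted mean of the vectors: sum(w_i * x_i) / sum(w_i), per coordinate."""
    total = sum(weights)
    dim = len(vectors[0])
    return tuple(
        sum(w * v[d] for v, w in zip(vectors, weights)) / total for d in range(dim)
    )


# Two members of one cluster with very different moduli (assumed toy data).
vectors = [(1.0, 0.0), (0.0, 4.0)]

# Assumption: weight each vector by its modulus, so larger vectors count more.
weights = [math.hypot(*v) for v in vectors]  # moduli 1.0 and 4.0

unweighted = weighted_centroid(vectors, [1.0, 1.0])  # ordinary k-means centroid
weighted = weighted_centroid(vectors, weights)
print(unweighted, weighted)
```

With equal weights the centroid is the ordinary mean (0.5, 2.0); with modulus weights it moves to roughly (0.2, 3.2), toward the larger vector.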
Optimization of constitutive parameters of foundation soils k-means clustering analysis (cited by: 7)
12
Authors: Muge Elif Orakoglu, Cevdet Emin Ekinci | Research in Cold and Arid Regions (CSCD) | 2013, Issue 5, pp. 626-636 (11 pages)
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Keywords: foundation soil; regression model; k-means clustering analysis
Geochemical and Geostatistical Studies for Estimating Gold Grade in Tarq Prospect Area by K-Means Clustering Method (cited by: 7)
13
Authors: Adel Shirazy, Aref Shirazi, Mohammad Hossein Ferdossi, Mansour Ziaii | Open Journal of Geology | 2019, Issue 6, pp. 306-326 (21 pages)
The Tarq geochemical 1:100,000 sheet is located in Isfahan province and has been investigated by Iran’s Geological and Explorations Organization using stream sediment analyses. This area has a stratigraphy of Precambrian to Quaternary rocks and is located in the Central Iran zone. Given the signs of gold mineralization in this area, it is necessary to identify its important mineral areas. Therefore, information is needed about the relationships among gold, arsenic, and antimony in this area in order to determine the extent of geochemical halos and to estimate the grade. The well-known K-means method is used for monitoring the elements in the present study; this is a clustering method based on minimizing the total Euclidean distance of each sample from the center of the class to which it is assigned. In this research, the clustering quality function and the utility rate of the sample in the desired cluster (S(i)) have been used to determine the optimum number of clusters. Finally, with regard to the cluster centers and the results, equations were used to predict the amount of gold based on four parameters: arsenic grade, antimony grade, and the length and width of sampling points.
Keywords: GOLD; Tarq; k-means clustering Method; Estimation of the Elements GRADE; k-means
Application of Self-Organizing Feature Map Neural Network Based on K-means Clustering in Network Intrusion Detection (cited by: 5)
14
Authors: Ling Tan, Chong Li, Jingming Xia, Jun Cao | Computers, Materials & Continua (SCIE, EI) | 2019, Issue 7, pp. 275-288 (14 pages)
Due to the widespread use of the Internet, customer information is vulnerable to attacks on computer systems, which creates an urgent need for intrusion detection technology. Recently, network intrusion detection has been one of the most important technologies in network security detection, and it has achieved high accuracy so far. However, these methods have very low efficiency in network intrusion detection, even the most popular SOM neural network method. In this paper, an efficient and fast network intrusion detection method is proposed. Firstly, the fundamentals of the two different methods are introduced. Then, the self-organizing feature map neural network based on K-means clustering (KSOM) algorithm is presented to improve the efficiency of network intrusion detection. Finally, NSL-KDD is used as the network intrusion data set to demonstrate that the KSOM method requires significantly fewer clustering iterations than the SOM method without substantially affecting the clustering results, and that its accuracy is much higher than that of the K-means method. The experimental results show that our method can relatively improve the accuracy of network intrusion detection and significantly reduce the number of clustering iterations.
Keywords: k-means clustering; self-organizing feature map neural network; network security; intrusion detection; NSL-KDD data set
K-means Find Density Peaks in Molecular Conformation Clustering (cited by: 1)
15
Authors: Guiyan Wang, Ting Fu, Hong Ren, Peijun Xu, Qiuhan Guo, Xiaohong Mou, Yan Li, Guohui Li | Chinese Journal of Chemical Physics (SCIE, EI, CAS, CSCD) | 2022, Issue 2, pp. 353-368, I0026-I0030, I0003 (22 pages)
Performing cluster analysis on molecular conformation is an important way to find the representative conformation in the molecular dynamics trajectories. Usually, it is a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to address its heavy resource consumption. In KFDP, the points are initially clustered by a high-efficiency clustering algorithm, such as K-means. Cluster centers are defined as typical points with a weight which represents the cluster size. Then, the weighted typical points are clustered again by FDP, and are then refined as core, boundary, and redefined halo points. In this way, KFDP has accuracy comparable to FDP, but its computational complexity is reduced from O(n^2) to O(n). We apply and test our KFDP method on the trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise validate the proposed KFDP.
Keywords: k-means find density peaks; Molecular clustering; Density-based spatial clustering of applications with noise
A K-means clustering based blind multiband spectrum sensing algorithm for cognitive radio (cited by: 2)
16
Authors: LEI Ke-jun, TAN Yang-hong, YANG Xi, WANG Han-rui | Journal of Central South University (SCIE, EI, CAS, CSCD) | 2018, Issue 10, pp. 2451-2461 (11 pages)
In this paper, a blind multiband spectrum sensing (BMSS) method requiring no knowledge of noise power, primary signal and wireless channel is proposed based on K-means clustering (KMC). In this approach, the KMC algorithm is used to identify the occupied subband set (OSS) and the idle subband set (ISS), and then the location and number of the occupied channels are obtained from the elements in the OSS. Compared with the classical BMSS methods based on the information theoretic criteria (ITC), the new method performs better, especially in low signal-to-noise ratio (SNR) and small sampling number scenarios, and detects more robustly under noise uncertainty or unequal noise variance. Meanwhile, the new method performs more stably than the ITC-based methods when the occupied subband number increases or the primary signals suffer multi-path fading. Simulation results verify the effectiveness of the proposed method.
Keywords: cognitive radio (CR); blind multiband spectrum sensing (BMSS); k-means clustering (KMC); occupied subband set (OSS); idle subband set (ISS); information theoretic criteria (ITC); noise uncertainty
Development of slope mass rating system using K-means and fuzzy c-means clustering algorithms (cited by: 1)
17
Authors: Jalali Zakaria | International Journal of Mining Science and Technology (SCIE, EI, CSCD) | 2016, Issue 6, pp. 959-966 (8 pages)
Classification systems such as Slope Mass Rating (SMR) are currently being used to undertake slope stability analysis. In the SMR classification system, data is allocated to certain classes based on linguistic and experience-based criteria. In order to eliminate linguistic criteria resulting from experience-based judgments and to account for uncertainties in determining the class boundaries developed by the SMR system, the system classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means (FCM), for the ratings obtained via continuous and discrete functions. By applying clustering algorithms in the SMR classification system, no in-advance experience-based judgment was made on the number of extracted classes in this system, and it was only after all steps of the clustering algorithms were accomplished that a new classification scheme was proposed for the SMR system under different failure modes based on the ratings obtained via continuous and discrete functions. The results of this study showed that engineers can achieve more reliable and objective evaluations of slope stability by using the SMR system based on the ratings calculated via continuous and discrete functions.
Keywords: SMR based on continuous functions; Slope stability analysis; k-means and FCM clustering algorithms; Validation of clustering algorithms; Sangan iron ore mines
Statistical prediction of waterflooding performance by K-means clustering and empirical modeling
18
Authors: Qin-Zhuo Liao, Liang Xue, Gang Lei, Xu Liu, Shu-Yu Sun, Shirish Patil | Petroleum Science (SCIE, CAS, CSCD) | 2022, Issue 3, pp. 1139-1152 (14 pages)
Statistical prediction is often required in reservoir simulation to quantify production uncertainty or assess potential risks. Most existing uncertainty quantification procedures aim to decompose the input random field to independent random variables, and may suffer from the curse of dimensionality if the correlation scale is small compared to the domain size. In this work, we develop and test a new approach, K-means clustering assisted empirical modeling, for efficiently estimating waterflooding performance for multiple geological realizations. This method performs single-phase flow simulations in a large number of realizations, and uses K-means clustering to select only a few representatives, on which the two-phase flow simulations are implemented. The empirical models are then adopted to describe the relation between the single-phase solutions and the two-phase solutions using these representatives. Finally, the two-phase solutions in all realizations can be predicted using the empirical models readily. The method is applied to both 2D and 3D synthetic models and is shown to perform well in the P10, P50 and P90 of production rates, as well as the probability distributions as illustrated by cumulative density functions. It is able to capture the ensemble statistics of the Monte Carlo simulation results with a large number of realizations, and the computational cost is significantly reduced.
Keywords: WATERFLOODING; Statistical prediction; k-means clustering; Empirical modeling; Uncertainty quantification
Similarity matrix-based K-means algorithm for text clustering
19
Authors: 曹奇敏, 郭巧, 吴向华 | Journal of Beijing Institute of Technology (EI, CAS) | 2015, Issue 4, pp. 566-572 (7 pages)
The K-means algorithm is one of the most widely used algorithms in clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional algorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The improved algorithm effectively avoids the random selection of initial center points; it therefore provides effective initial points for the clustering process and reduces the fluctuation of clustering results that stems from the selection of initial points, so a better clustering quality can be obtained. The experimental results also show that the F-measure of the improved K-means algorithm has been greatly improved and the clustering results are more stable.
Keywords: text clustering; k-means algorithm; similarity matrix; F-MEASURE
Oversampling Method Based on Gaussian Distribution and K-Means Clustering
20
Authors: Masoud Muhammed Hassan, Adel Sabry Eesa, Ahmed Jameel Mohammed, Wahab Kh. Arabo | Computers, Materials & Continua (SCIE, EI) | 2021, Issue 10, pp. 451-469 (19 pages)
Learning from imbalanced data is one of the most challenging problems in binary classification, and this problem has gained more importance in recent years. When the class distribution is imbalanced, classical machine learning algorithms tend to move strongly towards the majority class and disregard the minority. Therefore, the accuracy may be high, but the model cannot recognize data instances in the minority class to classify them, leading to many misclassifications. Different methods have been proposed in the literature to handle the imbalance problem, but most are complicated and tend to simulate unnecessary noise. In this paper, we propose a simple oversampling method based on Multivariate Gaussian distribution and K-means clustering, called GK-Means. The new method aims to avoid generating noise and control imbalances between and within classes. Various experiments have been carried out with six classifiers and four oversampling methods. Experimental results on different imbalanced datasets show that the proposed GK-Means outperforms other oversampling methods and improves classification performance as measured by F1-score and Accuracy.
Keywords: Class imbalance; OVERSAMPLING; GAUSSIAN multivariate distribution; k-means clustering