For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the o...For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the original space discretionarily in the existing methods, this paper proposes a new method for ensuring the clustering center that virtual clustering centers are defined in the feature space by the original classification as the initial cluster centers and the iteration clustering centers are ensured by the further virtual classification. The improved method is used for fault diagnosis of roller bearing that achieves a good cluster and diagnosis result, which demonstrates the effectiveness of the proposed method.展开更多
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial s...The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.展开更多
针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并...针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并网指令。其次,设计了改进侏儒猫鼬优化算法(improved dwarf mongoose optimizer,IDMO),并利用它对传统K-means聚类算法进行改进,加快了聚类速度。接着,制定了电池单元动态分组原则,并根据电池单元SOC利用改进K-means将其分为3个电池组。然后,设计了基于充放电函数的电池单元SOC一致性功率分配方法,并据此提出BESS双层功率分配策略,上层确定电池组充放电顺序及指令,下层计算电池单元充放电指令。对所提策略进行仿真验证,结果表明,所设计的IDMO具有更高的寻优精度及更快的寻优速度。所提BESS平抑光伏波动策略在有效平抑波动的同时,降低了BESS运行寿命损耗并提高了电池单元SOC的均衡性。展开更多
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,th...Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,the k-means clustering algorithm,is utilized to investigate and identify plasma events in the J-TEXT plasma.This method can cluster diverse plasma events with homogeneous features,and then these events can be identified if given few manually labeled examples based on physical understanding.A survey of clustered events reveals that the k-means algorithm can make plasma events(rotating tearing mode,sawtooth oscillations,and locked mode)gathering in Euclidean space composed of multi-dimensional diagnostic data,like soft x-ray emission intensity,edge toroidal rotation velocity,the Mirnov signal amplitude and so on.Based on the cluster analysis results,an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma.The cluster analysis method is conducive to data markers of massive diagnostic data.展开更多
Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Wes...Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Western Iraq as an example,a new reservoir classification and discrimination method is established by using the K-means clustering method and the Bayesian discrimination method.These methods are applied to non-cored wells to calculate the discrimination accuracy of the reservoir type,and thus the main reasons for low accuracy of reservoir discrimination are clarified.The results show that the discrimination accuracy of reservoir type based on K-means clustering and Bayesian stepwise discrimination is strongly related to the accuracy of the core data.The discrimination accuracy rate of TypeⅠ,TypeⅡ,and TypeⅤreservoirs is found to be significantly higher than that of TypeⅢand TypeⅣreservoirs using the method of combining K-means clustering and Bayesian theory based on logging data.Although the recognition accuracy of the new methodology for the TypeⅣreservoir is low,with average accuracy the new method has reached more than 82%in the entire study area,which lays a good foundation for rapid and accurate discrimination of reservoir types and the fine evaluation of a reservoir.展开更多
文摘For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the original space discretionarily in the existing methods, this paper proposes a new method for ensuring the clustering center that virtual clustering centers are defined in the feature space by the original classification as the initial cluster centers and the iteration clustering centers are ensured by the further virtual classification. The improved method is used for fault diagnosis of roller bearing that achieves a good cluster and diagnosis result, which demonstrates the effectiveness of the proposed method.
文摘The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
文摘针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并网指令。其次,设计了改进侏儒猫鼬优化算法(improved dwarf mongoose optimizer,IDMO),并利用它对传统K-means聚类算法进行改进,加快了聚类速度。接着,制定了电池单元动态分组原则,并根据电池单元SOC利用改进K-means将其分为3个电池组。然后,设计了基于充放电函数的电池单元SOC一致性功率分配方法,并据此提出BESS双层功率分配策略,上层确定电池组充放电顺序及指令,下层计算电池单元充放电指令。对所提策略进行仿真验证,结果表明,所设计的IDMO具有更高的寻优精度及更快的寻优速度。所提BESS平抑光伏波动策略在有效平抑波动的同时,降低了BESS运行寿命损耗并提高了电池单元SOC的均衡性。
基金supported by the National Magnetic Confinement Fusion Science Program of China(Nos.2018YFE0301104 and 2018YFE0301100)National Natural Science Foundation of China(Nos.12075096 and 51821005)。
文摘Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals,which can be applied to identify these events.A semisupervised machine learning algorithm,the k-means clustering algorithm,is utilized to investigate and identify plasma events in the J-TEXT plasma.This method can cluster diverse plasma events with homogeneous features,and then these events can be identified if given few manually labeled examples based on physical understanding.A survey of clustered events reveals that the k-means algorithm can make plasma events(rotating tearing mode,sawtooth oscillations,and locked mode)gathering in Euclidean space composed of multi-dimensional diagnostic data,like soft x-ray emission intensity,edge toroidal rotation velocity,the Mirnov signal amplitude and so on.Based on the cluster analysis results,an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma.The cluster analysis method is conducive to data markers of massive diagnostic data.
基金funded by the National Key Research and Development Program(Grant No.2018YFC0807804-2)。
文摘Reservoir classification is a key link in reservoir evaluation.However,traditional manual means are inefficient,subjective,and classification standards are not uniform.Therefore,taking the Mishrif Formation of the Western Iraq as an example,a new reservoir classification and discrimination method is established by using the K-means clustering method and the Bayesian discrimination method.These methods are applied to non-cored wells to calculate the discrimination accuracy of the reservoir type,and thus the main reasons for low accuracy of reservoir discrimination are clarified.The results show that the discrimination accuracy of reservoir type based on K-means clustering and Bayesian stepwise discrimination is strongly related to the accuracy of the core data.The discrimination accuracy rate of TypeⅠ,TypeⅡ,and TypeⅤreservoirs is found to be significantly higher than that of TypeⅢand TypeⅣreservoirs using the method of combining K-means clustering and Bayesian theory based on logging data.Although the recognition accuracy of the new methodology for the TypeⅣreservoir is low,with average accuracy the new method has reached more than 82%in the entire study area,which lays a good foundation for rapid and accurate discrimination of reservoir types and the fine evaluation of a reservoir.