The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
The Tarq 1:100,000 geochemical sheet is located in Isfahan province and has been investigated by Iran's Geological and Explorations Organization using stream sediment analyses. The area has a stratigraphy of Precambrian to Quaternary rocks and lies in the Central Iran zone. Because there are signs of gold mineralization in the area, it is necessary to identify its important mineralized zones. This requires information about how gold, arsenic, and antimony relate to one another, in order to determine the extent of geochemical halos and to estimate the grade. The present study therefore uses the well-known K-means method to monitor these elements; K-means is a clustering method based on minimizing the total Euclidean distance of each sample from the center of the class to which it is assigned. The clustering quality function and the utility rate of each sample in its cluster (S(i)) were used to determine the optimum number of clusters. Finally, based on the cluster centers and the results, equations were used to predict the amount of gold from four parameters: arsenic grade, antimony grade, and the length and width of the sampling points.
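The abstract does not define S(i). As a minimal sketch, assuming S(i) denotes the standard silhouette value of sample i, it could be computed as below; the function name `silhouette_values` and the toy data in the note are illustrative and not taken from the study.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value S(i) for every sample.

    Assumes at least two clusters. S(i) = (b - a) / max(a, b), where a is the
    mean distance from sample i to the other members of its own cluster and
    b is the smallest mean distance from sample i to any other cluster.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a singleton cluster gets S(i) = 0.
            values.append(0.0)
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        values.append((b - a) / max(a, b))
    return values
```

Averaging S(i) over all samples for several candidate values of k, and keeping the k with the highest average, is one common way to choose the number of clusters.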
A total of 10 indices of regional economic development in Guangxi are selected. Based on the relevant economic data, regional economic development in Guangxi is analyzed using the System Clustering Method and the Principal Component Analysis Method. The two methods give similar results for the level of economic development. The overall economic strength of Guangxi is weak, and Nanning has relatively high factor scores owing to its advantage as the political, economic and cultural center. The comprehensive scores of the other regions are all lower than 1, a large gap compared with Nanning. As an overall development strategy, Guangxi should accelerate the construction of the Ring Northern Bay Economic Zone, create a strong logistics system of strategic significance to national development, and use its unique location and the modern transportation system to establish a logistics center and business center connecting the hinterland and the ASEAN market. To address unbalanced regional economic development in Guangxi, the development of the service industry in Nanning should be sped up, a circular economy system for industrial cities should be constructed, and the industrialization of tourism cities should be accelerated, in order to achieve balanced regional economic development in Guangxi, China.
The problem of taking a set of data and separating it into subgroups where the elements of each subgroup are more similar to each other than they are to elements not in the subgroup has been extensively studied through the statistical method of cluster analysis. In this paper we discuss the application of this method to the field of education: in particular, we present the use of cluster analysis to separate students into groups that can be recognized and characterized by common traits in their answers to a questionnaire, without any prior knowledge of what form those groups would take (unsupervised classification). We start from a detailed study of the data processing needed by cluster analysis. Two methods commonly used in cluster analysis are then described, first from a theoretical point of view and then, in Section 4, through an example of application to data from an open-ended questionnaire administered to a sample of university students. In particular, we describe and critique the variables and parameters used to show the results of the cluster analysis methods.
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. This algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next. Our experimental results demonstrate that the scheme can improve the computational speed of the k-means algorithm in terms of both the total number of distance calculations and the overall computation time.
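The objective described above can be illustrated with a minimal sketch of the standard (Lloyd-style) k-means iteration in plain Python. This is the baseline algorithm, not the enhanced variant the abstract proposes; the names `kmeans` and `mean_squared_distance` and the random initialization are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Baseline k-means: alternate assignment and center update until stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed initialization: k random data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)
```

Each iteration of this baseline computes n × k distances, which is the cost the enhanced algorithm aims to reduce by reusing information from the previous iteration.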
The classification of the Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze its characteristics in detail. Based on the daily precipitation data of the northeastern China (NEC) region, and the six-hourly atmospheric circulation field and temperature field data of ERA-Interim, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. The NCCV processes were then classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects: (1) the atmospheric circulation configuration of the NCCV on the various paths; and (2) its influences on the climate conditions in the NEC. The results showed that the activity paths of the NCCV could be divided into four types according to characteristics such as the generation origin, movement direction, and movement velocity of the NCCV. These were the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type or type D). Obvious differences were observed in the atmospheric circulation configuration and the climate impact of the NCCV on the four types of paths, which indicated that the classification results were reasonable.
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals, which can be used to identify these events. A semisupervised machine learning algorithm, the k-means clustering algorithm, is used to investigate and identify plasma events in the J-TEXT plasma. This method can cluster diverse plasma events with homogeneous features, and these events can then be identified given a few manually labeled examples based on physical understanding. A survey of clustered events reveals that the k-means algorithm can group plasma events (rotating tearing mode, sawtooth oscillations, and locked mode) in a Euclidean space composed of multi-dimensional diagnostic data, such as soft x-ray emission intensity, edge toroidal rotation velocity, and the Mirnov signal amplitude. Based on the cluster analysis results, an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma. The cluster analysis method is conducive to labeling massive diagnostic data.
An evaluation index is a prerequisite for the scientific evaluation of a public meteorological service. This paper aims to explore a technical method for determining and screening evaluation indicators. Based on public satisfaction survey data obtained in Wafangdian, China in 2010, this study investigates the suitability of the fuzzy clustering analysis method for establishing an evaluation index. Through quantitative analysis of multilayer fuzzy clustering of various evaluation indicators, correlation analysis indicates that if the clustering results are identical for two evaluation indicators in the same sub-evaluation layer, then one indicator can be removed, or the two indicators merged. For evaluation indicators in different sub-evaluation layers, although clustering reveals attribute correlations, these indicators may not be substituted for one another. Analysis of the applicability of the fuzzy clustering method shows that it plays a certain role in the establishment and correction of an evaluation index.
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow rule and other methods were used together to build logistic regression, cluster analysis, hyper-parameter test and other models. SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
To address the poor performance of speech signal detection at low signal-to-noise ratio (SNR), a method is proposed to detect active speech frames based on multi-window time-frequency (T-F) diagrams. First, the T-F diagram of the signal is calculated based on a multi-window T-F analysis, and a speech test statistic is constructed based on the characteristic difference between the signal and background noise. Second, dynamic double-threshold processing is used for preliminary detection, and the global double-threshold value is then obtained using K-means clustering. Finally, the detection results are obtained by sequential decision. The experimental results show that the overall performance of the method is better than that of traditional methods under various SNR conditions and background noises. The method also has the advantages of low complexity, strong robustness, and adaptability to multiple languages.
The paper deals with cluster analysis and the comparison of clustering methods. Cluster analysis belongs to the multivariate statistical methods. It is defined as a general logical technique or procedure that allows objects to be clustered into groups (clusters) on the basis of similarity or dissimilarity. Cluster analysis involves computational procedures whose purpose is to reduce a set of data to several relatively homogeneous groups (clusters), subject to maximal similarity within clusters and minimal similarity between clusters. Similarity of objects is studied by the degree of similarity (correlation coefficient and association coefficient) or by the degree of dissimilarity, that is, the degree of distance (distance coefficient). Methods of cluster analysis are classified, according to how they cluster, as hierarchical or non-hierarchical.
The K-means method is one of the most widely used clustering methods and has been implemented in many fields of science and technology. One of the major problems of the k-means algorithm is that it may produce empty clusters, depending on the initial center vectors. Genetic Algorithms (GAs) are adaptive heuristic search algorithms based on the evolutionary principles of natural selection and genetics. This paper presents a hybrid version of the k-means algorithm with GAs that efficiently eliminates the empty cluster problem. Results of simulation experiments using several data sets support this claim.
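To make the empty cluster problem concrete, the sketch below shows how a poorly placed initial center can attract no points, together with one simple repair heuristic (moving the empty cluster's center onto the point farthest from its own center). This heuristic is an assumption chosen for illustration; it is not the GA-based hybrid the abstract describes, and the function names are hypothetical.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def repair_empty_clusters(points, centers, labels):
    """Reseed every empty cluster (assumes len(points) >= len(centers)).

    Illustrative heuristic only: the empty cluster takes over the point that
    lies farthest from its current center, chosen among clusters that have
    more than one member so that no new empty cluster is created.
    """
    centers = list(centers)
    labels = list(labels)
    sizes = Counter(labels)
    for j in range(len(centers)):
        if sizes[j] > 0:
            continue
        candidates = [i for i in range(len(points)) if sizes[labels[i]] > 1]
        far = max(
            candidates,
            key=lambda i: squared_distance(points[i], centers[labels[i]]),
        )
        sizes[labels[far]] -= 1
        labels[far] = j
        sizes[j] += 1
        centers[j] = points[far]
    return centers, labels
```

For example, with points `[(0, 0), (1, 0), (0, 1)]` and initial centers `[(0, 0), (100, 100)]`, `assign` places every point in cluster 0, leaving cluster 1 empty until `repair_empty_clusters` is applied.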
Internet services and web-based applications play pivotal roles in various sensitive domains, encompassing e-commerce, e-learning, e-healthcare, and e-payment. However, safeguarding these services poses a significant challenge, as the need for robust security measures becomes increasingly imperative. This paper presents an innovative method based on differential analyses to detect abrupt changes in network traffic characteristics. The core concept revolves around identifying abrupt alterations in certain characteristics of the analyzed traffic, such as input/output volume, the number of TCP connections, or DNS queries. Initially, the traffic is segmented into distinct sequences of slices, followed by quantifying specific characteristics for each slice. Subsequently, the distance between successive values of these measured characteristics is computed and clustered to detect sudden changes. To accomplish its objectives, the approach combines several techniques, including propositional logic, distance metrics (e.g., Kullback-Leibler divergence), and clustering algorithms (e.g., K-means). When applied to two distinct datasets, the proposed approach demonstrates exceptional performance, achieving detection rates of up to 100%.
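The "distance between successive values" step can be sketched as follows, assuming each traffic slice is summarized as a vector of counts (for example, counts per traffic category) and that the Kullback-Leibler divergence is used as the distance. The slice representation, the smoothing constant `eps`, and the function names are assumptions for illustration, not details given in the abstract.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions.

    eps is an assumed smoothing constant that avoids division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def to_distribution(counts):
    """Normalize non-negative counts so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def successive_divergences(slices):
    """Divergence of each slice's distribution from that of the previous slice."""
    dists = [to_distribution(s) for s in slices]
    return [kl_divergence(dists[i + 1], dists[i]) for i in range(len(dists) - 1)]
```

The resulting sequence of divergences could then be clustered (for instance with K-means, k = 2) so that unusually large values separate from the rest and flag abrupt changes.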
Classification systems such as Slope Mass Rating (SMR) are currently used for slope stability analysis. In the SMR classification system, data are allocated to classes based on linguistic and experience-based criteria. To eliminate the linguistic criteria that result from experience-based judgments, and to account for uncertainties in the class boundaries defined by the SMR system, the classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means (FCM), for the ratings obtained via continuous and discrete functions. When the clustering algorithms were applied to the SMR classification system, no experience-based judgment was made in advance on the number of classes to extract; only after all steps of the clustering algorithms were completed was a new classification scheme proposed for the SMR system under different failure modes, based on the ratings obtained via continuous and discrete functions. The results of this study showed that engineers can achieve more reliable and objective evaluations of slope stability by using the SMR system based on the ratings calculated via continuous and discrete functions.
The original temporal clustering analysis (OTCA) is an effective technique for obtaining brain activation maps when the timing and location of the activation are completely unknown, but its lack of sensitivity becomes apparent when processing relatively weak brain activation signals. Considering the weakness of the functional magnetic resonance imaging (fMRI) signal in the rat model, a time slice analysis method based on OTCA is proposed. By dividing the stimulation period into several time slices and, after background removal, analyzing each slice separately to detect the activated pixels, the sensitivity is significantly improved. The inhibitory response in the hypothalamus after glucose loading is detected successfully with this method in an experiment on rats. Combined with the OTCA method, the time slice analysis method is effective in detecting when, where and which type of response will happen after stimulation, even if the fMRI signal is weak.
Supervised learning methods (e.g., PLS-DA, SVM) have been widely used with laser-induced breakdown spectroscopy (LIBS) to classify materials; however, they may yield a low correct classification rate if a test sample type is not included in the training dataset. In this paper, unsupervised cluster analysis methods (hierarchical clustering analysis, K-means clustering analysis, and the iterative self-organizing data analysis technique, ISODATA) are investigated for plastics classification based on the line intensities of LIBS emission. The results of hierarchical clustering analysis using four different similarity measures (single linkage, complete linkage, unweighted pair-group average, and weighted pair-group average) are compared. In K-means clustering analysis, four methods of choosing the initial centers are applied and their results are compared. The classification results of hierarchical clustering analysis, K-means clustering analysis, and ISODATA are analyzed. The experimental results demonstrate that cluster analysis methods can be applied to plastics discrimination with LIBS.
During a power system black start after an accident, decomposing the power system into several independent partitions for parallel recovery helps to optimize resource allocation and accelerate the recovery process. Taking full account of the fuzziness of black-start zone partitioning, a new algorithm based on fuzzy clustering analysis is presented. Characteristic indexes are extracted fully and accurately. The raw data matrix is made up of the electrical distances between each node and the black-start resources. The transitive closure method is used to obtain the dynamic clustering. Finally, the availability and feasibility of the proposed algorithm are verified on the New England 39-bus system.
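As a minimal sketch of the closure-based fuzzy clustering step, assuming the "closure transfer method" refers to the usual transitive closure of a fuzzy similarity matrix under max-min composition: the closure is computed by repeated squaring, and cutting it at a threshold λ yields a partition. How the similarity matrix is derived from electrical distances is not stated in the abstract, so the input here is simply assumed to be a reflexive, symmetric matrix with entries in [0, 1]; all function names are illustrative.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing.

    Assumes r is reflexive (ones on the diagonal), which guarantees that the
    repeated squaring converges after finitely many steps.
    """
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared


def lambda_cut_clusters(closure, lam):
    """Group indices whose closure similarity is at least lam."""
    seen = set()
    clusters = []
    for i in range(len(closure)):
        if i in seen:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters
```

Sweeping λ from 1 down to 0 merges clusters step by step, which gives the dynamic clustering from which a partition with the desired number of zones can be read off.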
This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, which requires orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap, to be optimal simultaneously. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, consisting of the index normalization method, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained using the proposed method. Principal component analysis reduces the number of indices that are not independent of one another, which saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
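The abstract does not give the normalization formula. A minimal sketch, assuming plain min-max scaling (with the direction flipped for indices where smaller is better, such as the maximum coverage gap), could look like this; the function name and the handling of constant indices are illustrative assumptions.

```python
def normalize_index(values, larger_is_better=True):
    """Min-max scale one performance index into the range [0, 1].

    Assumed convention: 1 is always the most desirable value, so indices
    where smaller is better are flipped after scaling.
    """
    low, high = min(values), max(values)
    if high == low:
        # Assumption: a constant index carries no information; map it to 0.
        return [0.0 for _ in values]
    scaled = [(v - low) / (high - low) for v in values]
    if larger_is_better:
        return scaled
    return [1.0 - s for s in scaled]
```

After every index has been scaled this way, candidate orbits can be compared on a common dimensionless footing before principal component analysis and clustering are applied.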
In this paper, we apply the clustering analysis techniques of data mining to the power system. We adopt the K-means clustering algorithm to analyze customer load and to identify similar behavior among electricity customers, and we adopt principal component analysis to visualize the clustering result. Simulation and analysis in MATLAB verify that the clustering is reasonable. The conclusions of this paper can provide an important basis for peak load regulation and for the stable, secure operation of the power system.
Funding (Northeast China Cold Vortex path classification study): This research was jointly supported by the National Natural Science Foundation of China (Grant No. 42005037), the Liaoning Provincial Natural Science Foundation Project (PhD Start-up Research Fund 2019-BS-214), the Special Scientific Research Project for the Forecaster (Grant No. CMAYBY2018-018), a Key Technical Project of Liaoning Meteorological Bureau (Grant No. LNGJ201903), the National Key Research and Development Project (Grant No. 2018YFC1505601), and the Open Foundation Project of the Institute of Atmospheric Environment, China Meteorological Administration (Grant Nos. 2020SYIAE08 and 2020SYIAEZD5).
Funding (J-TEXT plasma event identification study): Supported by the National Magnetic Confinement Fusion Science Program of China (Nos. 2018YFE0301104 and 2018YFE0301100) and the National Natural Science Foundation of China (Nos. 12075096 and 51821005).
Funding (public meteorological service evaluation study): National Science Foundation of China (91637105, 41775048 and 41475041); National Key R&D Program of China (2018YFC1507800); Research on Tourism Traffic Meteorological Service Products in Heilongjiang Province (HQZD2017004).
文摘An evaluation index is a prerequisite for the scientific evaluation of a public meteorological service.This paper aims to explore a technical method for determining and screening evaluation indicators.Based on public satisfaction survey data obtained in Wafangdian,China in 2010,this study investigates the suitability of fuzzy clustering analysis method in establishing an evaluation index.Through quantitative analysis of multilayer fuzzy clustering of various evaluation indicators,correlation analysis indicates that if the results of clustering were identical for two evaluation indicators in the same sub-evaluation layer,then one indicator could be removed,or the two indicators merged.For evaluation indicators in different sub-evaluation layers,although clustering reveals attribute correlations,these indicators may not be substituted for one another.Analysis of the applicability of the fuzzy clustering method shows that it plays a certain role in the establishment and correction of an evaluation index.
Abstract: For the composition analysis and identification of ancient glass products, L1 regularization, K-means cluster analysis, the elbow method and other techniques were combined to build logistic regression, cluster analysis, hyper-parameter test and other models. SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
Funding: The National Natural Science Foundation of China (No. 12174053, 91938203, 11674057, 11874109); the Fundamental Research Funds for the Central Universities (No. 2242021k30019).
Abstract: To address the poor performance of speech signal detection at low signal-to-noise ratio (SNR), a method is proposed to detect active speech frames based on multi-window time-frequency (T-F) diagrams. First, the T-F diagram of the signal is calculated based on a multi-window T-F analysis, and a speech test statistic is constructed based on the characteristic difference between the signal and background noise. Second, dynamic double-threshold processing is used for preliminary detection, and then the global double-threshold value is obtained using K-means clustering. Finally, the detection results are obtained by sequential decision. The experimental results show that the overall performance of the method is better than that of traditional methods under various SNR conditions and background noises. This method also has the advantages of low complexity, strong robustness, and adaptability to multiple languages.
Abstract: The paper deals with cluster analysis and a comparison of clustering methods. Cluster analysis belongs to the multivariate statistical methods. It is defined as a general logical technique or procedure that groups objects into clusters on the basis of similarity or dissimilarity. Cluster analysis involves computational procedures whose purpose is to reduce a set of data to several relatively homogeneous groups (clusters), under the condition of maximal similarity within clusters and, simultaneously, minimal similarity between clusters. Similarity of objects is studied by the degree of similarity (correlation coefficient and association coefficient) or the degree of dissimilarity, that is, the degree of distance (distance coefficient). Methods of cluster analysis are classified, according to how they cluster, as hierarchical or non-hierarchical methods.
Abstract: The k-means method is one of the most widely used clustering methods and has been implemented in many fields of science and technology. One of the major problems of the k-means algorithm is that it may produce empty clusters depending on the initial center vectors. Genetic Algorithms (GAs) are adaptive heuristic search algorithms based on the evolutionary principles of natural selection and genetics. This paper presents a hybrid version of the k-means algorithm with GAs that efficiently eliminates this empty cluster problem. Results of simulation experiments using several data sets support this claim.
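The empty cluster problem mentioned above can be reproduced with a single assignment step. The sketch below only detects the problem; it is not the GA hybrid the paper proposes, and the points, the initial centers, and the helper names `assign` and `empty_clusters` are hypothetical.

```python
import math

def assign(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean)."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def empty_clusters(labels, k):
    """Return the indices of clusters that received no points."""
    used = set(labels)
    return [j for j in range(k) if j not in used]

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
# Assumed poor initialization: the third center lies far from all data.
centers = [(0.0, 0.0), (10.0, 10.0), (100.0, 100.0)]

labels = assign(points, centers)
empties = empty_clusters(labels, len(centers))
```

Because no point is closest to the third center, its mean is undefined in the update step, which is why a plain k-means iteration needs some repair strategy at this point.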
Abstract: Internet services and web-based applications play pivotal roles in various sensitive domains, encompassing e-commerce, e-learning, e-healthcare, and e-payment. However, safeguarding these services poses a significant challenge, as the need for robust security measures becomes increasingly imperative. This paper presents an innovative method based on differential analyses to detect abrupt changes in network traffic characteristics. The core concept revolves around identifying abrupt alterations in certain characteristics of the analyzed traffic, such as input/output volume, the number of TCP connections, or DNS queries. Initially, the traffic is segmented into distinct sequences of slices, followed by quantifying specific characteristics for each slice. Subsequently, the distance between successive values of these measured characteristics is computed and clustered to detect sudden changes. To accomplish its objectives, the approach combines several techniques, including propositional logic, distance metrics (e.g., Kullback-Leibler divergence), and clustering algorithms (e.g., K-means). When applied to two distinct datasets, the proposed approach demonstrates exceptional performance, achieving detection rates of up to 100%.
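The distance step described above can be illustrated with the Kullback-Leibler divergence between the distributions of successive slices. This is a minimal sketch under stated assumptions: the per-slice histograms are invented, every bin is assumed non-zero, and the largest distance is simply picked out instead of clustering the distances as the paper does.

```python
import math

def normalize(counts):
    """Turn raw counts into a discrete probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions of equal length.

    Assumes every entry of q is strictly positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical per-slice histograms of one traffic characteristic.
slices = [[50, 30, 20], [48, 32, 20], [5, 5, 90], [6, 4, 90]]
dists = [normalize(s) for s in slices]

# Distance between each slice and the next one.
distances = [kl_divergence(dists[i], dists[i + 1])
             for i in range(len(dists) - 1)]

# Index of the transition with the largest divergence.
change_at = distances.index(max(distances))
```

In this invented data the jump between the second and third slice dominates, so that transition is the candidate for an abrupt change.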
Abstract: Classification systems such as Slope Mass Rating (SMR) are currently used to undertake slope stability analysis. In the SMR classification system, data are allocated to classes based on linguistic and experience-based criteria. To eliminate linguistic criteria resulting from experience-based judgments and to account for uncertainties in the class boundaries defined by the SMR system, the classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means (FCM), for the ratings obtained via continuous and discrete functions. With the clustering algorithms applied, no advance experience-based judgment was made on the number of classes extracted; only after all steps of the clustering algorithms were completed was a new classification scheme proposed for the SMR system under different failure modes, based on the ratings obtained via continuous and discrete functions. The results showed that engineers can achieve more reliable and objective evaluations of slope stability by using the SMR system with ratings calculated via continuous and discrete functions.
Funding: The National Natural Science Foundation of China (30370432)
Abstract: The original temporal clustering analysis (OTCA) is an effective technique for obtaining brain activation maps when the timing and location of the activation are completely unknown, but its lack of sensitivity is exposed when processing relatively weak brain activation signals. A time slice analysis method based on OTCA is proposed to address the weakness of the functional magnetic resonance imaging (fMRI) signal in the rat model. By dividing the stimulation period into several time slices and, after background removal, analyzing each slice separately to detect the activated pixels, the sensitivity is significantly improved. The inhibitory response in the hypothalamus after glucose loading is detected successfully with this method in the rat experiment. Combined with the OTCA method, the time slice analysis method is effective in detecting when, where and which type of response occurs after stimulation, even if the fMRI signal is weak.
Funding: Supported by Beijing Natural Science Foundation of China (No. 4132063)
Abstract: Supervised learning methods (e.g., PLS-DA, SVM, etc.) have been widely used with laser-induced breakdown spectroscopy (LIBS) to classify materials; however, they may yield a low correct classification rate if the type of a test sample is not included in the training dataset. In this paper, unsupervised cluster analysis methods (hierarchical clustering analysis, K-means clustering analysis, and the iterative self-organizing data analysis technique, ISODATA) are investigated for plastics classification based on the line intensities of LIBS emission. The results of hierarchical clustering analysis using four different similarity measures (single linkage, complete linkage, unweighted pair-group average, and weighted pair-group average) are compared. In K-means clustering analysis, four methods of choosing initial centers are applied and their results are compared. The classification results of hierarchical clustering analysis, K-means clustering analysis, and ISODATA are analyzed. The experimental results demonstrate that cluster analysis methods can be applied to plastics discrimination with LIBS.
Abstract: During power system black start after an accident, decomposing the power system into several independent partitions for parallel recovery can help optimize resource allocation and accelerate the recovery process. With adequate consideration of the fuzziness of black-start zone partitioning, a new algorithm based on fuzzy clustering analysis is presented. Characteristic indexes are extracted fully and accurately. The raw data matrix is made up of the electrical distances between every node and the black-start resources. The transitive closure method is used to obtain the dynamic clustering. Finally, the availability and feasibility of the proposed algorithm are verified on the New England 39-bus system.
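The closure step described above (the fuzzy transitive closure, in standard terminology) can be sketched with repeated max-min composition followed by a lambda-cut. The sketch assumes a reflexive, symmetric similarity matrix with entries in [0, 1] has already been built; how the authors derive it from electrical distances is not given in the abstract, and the matrix values and function names below are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def transitive_closure(r):
    """Square the relation until it stops changing (R o R == R)."""
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2

def lambda_cut(closure, lam):
    """Group indices whose closure similarity is at least lam."""
    clusters, seen = [], set()
    for i in range(len(closure)):
        if i in seen:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters

# Hypothetical similarity matrix for three nodes.
similarity = [[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.4],
              [0.2, 0.4, 1.0]]
closure = transitive_closure(similarity)
partitions_high = lambda_cut(closure, 0.5)
partitions_low = lambda_cut(closure, 0.4)
```

Lowering the threshold merges partitions, which is what makes the clustering "dynamic": each lambda value gives one partitioning of the nodes.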
Funding: Funded by 973 Program of Ministry of National Defense of China (Grant No. 613237)
Abstract: This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, which requires the simultaneous optimality of orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, which consists of the index normalization method, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained by using the proposed method. The principal component analysis can reduce the total number of non-independent indices to save computing time. Similarly, the multiple-level cluster analysis with parallel computing can save computing time.
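The index normalization described above can be sketched with min-max scaling, which maps indices with different units into [0, 1]. The abstract does not give the authors' formula, so min-max scaling, the `larger_is_better` flag, and the sample values are all assumptions made for illustration.

```python
def min_max_normalize(values, larger_is_better=True):
    """Scale values into [0, 1] so that 1 is always the preferred end.

    For indices where a smaller raw value is better (for example a
    maximum coverage gap), set larger_is_better=False to flip the scale.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are identical on this index; it carries no information.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if larger_is_better else [1.0 - s for s in scaled]

# Hypothetical values of two indices for three candidate orbits.
coverage_time = min_max_normalize([10.0, 20.0, 30.0])
coverage_gap = min_max_normalize([4.0, 2.0, 0.0], larger_is_better=False)
```

After this step every index is dimensionless and oriented the same way, so the columns can be passed to principal component analysis or combined in a weighted evaluation.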
Abstract: In this paper, we apply the clustering analysis of data mining to power systems. We adopt the K-means clustering algorithm to analyze customer load and identify similar behavior among electricity customers, and we adopt principal component analysis to visualize the clustering result. Simulation and analysis using MATLAB verify the rationality of the clusters. The conclusions of this paper can provide an important basis for peak-load regulation and for the secure and stable operation of the power system.