29,763 articles found
ARHCS (Automatic Rainfall Half-Life Cluster System): A Landslides Early Warning System (LEWS) Using Cluster Analysis and Automatic Threshold Definition
1
Authors: Cassiano Antonio Bortolozo, Luana Albertani Pampuch, Marcio Roberto Magalhães De Andrade, Daniel Metodiev, Adenilson Roberto Carvalho, Tatiana Sussel Gonçalves Mendes, Tristan Pryer, Harideva Marturano Egas, Rodolfo Moreda Mendes, Isadora Araújo Sousa, Jenny Power. International Journal of Geosciences (CAS), 2024, Issue 1, pp. 54-69 (16 pages)
A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, there is a requirement for the location to have been previously monitored in some way to have this type of information recorded. Another significant limitation is the need for information regarding the location and timing of incidents. Despite the current ease of obtaining location information (GPS, drone images, etc.), the timing of the event remains challenging to ascertain for a considerable portion of landslide data. Concerning rainfall monitoring, there are multiple ways to consider it, for instance, examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h), as well as calculating effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rain monitoring approach are defined manually and subjectively, relying on the operators' experience. This makes the process labor-intensive and time-consuming, hindering the establishment of a truly standardized and rapidly scalable methodology. In this work, we propose a Landslides Early Warning System (LEWS) based on the concept of rainfall half-life and the determination of thresholds using Cluster Analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one utilized by Cemaden, Brazil's National Center for Monitoring and Early Warning of Natural Disasters.
Keywords: Landslides Early Warning System (LEWS); cluster analysis; landslides; Brazil
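The abstract above builds on rainfall half-life but does not state a formula. The following is a minimal sketch, assuming a common exponential-decay form in which rain that fell t hours ago is weighted by 0.5^(t/T) for a half-life T. The function name, the half-life value, and the sample series are hypothetical and are not taken from the paper.

```python
def effective_rainfall(hourly_mm, half_life_h):
    """Sum hourly rainfall, weighting each hour by 0.5 ** (age / half_life).

    hourly_mm[-1] is the most recent hour (age 0); older hours decay.
    Assumed formulation for illustration, not the ARHCS definition.
    """
    if half_life_h <= 0:
        raise ValueError("half_life_h must be positive")
    n = len(hourly_mm)
    return sum(
        mm * 0.5 ** ((n - 1 - i) / half_life_h)
        for i, mm in enumerate(hourly_mm)
    )


# Hypothetical series: 8 mm fell two hours ago, 4 mm in the latest hour.
series = [8.0, 0.0, 4.0]
result = effective_rainfall(series, half_life_h=2.0)
print(result)  # 8 * 0.5 + 0 + 4 = 8.0
```

With a two-hour half-life, the 8 mm that fell two hours earlier counts as 4 mm, so the effective total equals 8 mm. A longer half-life would keep more of the older rain in the total.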
Composition Analysis and Identification of Ancient Glass Products Based on L1 Regularization Logistic Regression
2
Authors: Yuqiao Zhou, Xinyang Xu, Wenjing Ma. Applied Mathematics, 2024, Issue 1, pp. 51-64 (14 pages)
For the composition analysis and identification of ancient glass products, L1 regularization, K-means cluster analysis, the elbow rule, and other methods were combined to build logistic regression, cluster analysis, hyper-parameter test, and other models. SPSS, Python, and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
Keywords: Glass Composition; L1 Regularization; Logistic Regression Model; K-means Clustering Analysis; Elbow Rule; Parameter Verification
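The elbow rule named in the abstract above can be illustrated with a small, dependency-free sketch. It assumes plain k-means with deterministic farthest-point seeding and picks K at the largest second difference of the within-cluster sum of squares, which is one simple elbow heuristic among several. The helper names and the toy data are hypothetical and do not come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, iters=20):
    """Run k-means and return the within-cluster sum of squares."""
    # Deterministic farthest-point seeding (an assumption, for repeatability).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Move each center to its group mean; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Toy data: three tight, well-separated groups.
data = [(0, 0), (0, 1), (1, 0),
        (10, 10), (10, 11), (11, 10),
        (20, 0), (20, 1), (21, 0)]
inertia = {k: kmeans_inertia(data, k) for k in range(1, 5)}

# Elbow heuristic: largest second difference of the inertia curve.
elbow = max(
    range(2, 4),
    key=lambda k: (inertia[k - 1] - inertia[k]) - (inertia[k] - inertia[k + 1]),
)
print(elbow)  # 3
```

On this toy data the inertia falls steeply up to K = 3 and barely changes afterwards, so the heuristic selects 3. On real data the curve is usually smoother and the choice may need a visual check.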
Study Progress Analysis of Effluent Quality Prediction in Activated Sludge Process Based on CiteSpace
3
Authors: Kemeng Xue. Journal of Water Resource and Protection (CAS), 2024, Issue 6, pp. 450-465 (16 pages)
In this paper, CiteSpace, a bibliometrics software package, was used to collect research papers published on the Web of Science that are relevant to biological models and effluent quality prediction in the activated sludge process in wastewater treatment. By means of trend maps, keyword knowledge maps, and co-cited knowledge maps, a visual analysis identifying the authors, institutions, and regions was carried out. Furthermore, the topics and hotspots of water quality prediction in the activated sludge process were determined through literature co-citation-based cluster analysis and citation burst analysis, which not only reflected the historical evolution to a certain extent, but also provided direction and insight into the knowledge structure of water quality prediction and the activated sludge process for future research.
Keywords: Biological Model; Effluent Quality Prediction; Activated Sludge Process; CiteSpace; Knowledge Map; Co-Citation; Cluster Analysis
Exploring Motor Imagery EEG: Enhanced EEG Microstate Analysis with GMD-Driven Density Canopy Method
4
Authors: Xin Xiong, Jing Zhang, Sanli Yi, Chunwu Wang, Ruixiang Liu, Jianfeng He. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 4659-4681 (23 pages)
The analysis of microstates in EEG signals is a crucial technique for understanding the spatiotemporal dynamics of brain electrical activity. Traditional methods such as Atomic Agglomerative Hierarchical Clustering (AAHC), K-means clustering, Principal Component Analysis (PCA), and Independent Component Analysis (ICA) are limited by a fixed number of microstate maps and insufficient capability in cross-task feature extraction. Tackling these limitations, this study introduces a Global Map Dissimilarity (GMD)-driven density canopy K-means clustering algorithm. This approach autonomously determines the optimal number of EEG microstate topographies and employs Gaussian kernel density estimation alongside the GMD index for dynamic modeling of EEG data. Utilizing this algorithm, the study analyzes the Motor Imagery (MI) dataset from the GigaScience database, GigaDB. The findings reveal six distinct microstates during actual right-hand movement and five microstates across other task conditions, with microstate C showing superior performance in all task states. During imagined movement, microstate A was significantly enhanced. Comparison with existing algorithms indicates a significant improvement in clustering performance by the refined method, with an average Calinski-Harabasz Index (CHI) of 35517.29 and an average Davies-Bouldin Index (DBI) of 2.57. Furthermore, an information-theoretical analysis of the microstate sequences suggests that imagined movement exhibits higher complexity and disorder than actual movement. By utilizing the extracted microstate sequence parameters as features, the improved algorithm achieved a classification accuracy of 98.41% in EEG signal categorization for motor imagery. An accuracy of 78.183% was achieved in a four-class motor imagery task on the BCI-IV-2a dataset. These results demonstrate the potential of the algorithm in microstate analysis, offering a more effective tool for a deeper understanding of the spatiotemporal features of EEG signals.
Keywords: EEG microstate; motor imagery; k-means clustering algorithm; Gaussian kernel function; Shannon entropy; Lempel-Ziv complexity
Analysis of the Employment Situation of Non Private Enterprises in Various Regions of China
5
Authors: Junyi Wang. Open Journal of Applied Sciences, 2024, Issue 1, pp. 131-144 (14 pages)
In the past 30 years, Chinese enterprises have been a hot topic of discussion and concern among the general public in terms of economic and social status, ownership structure, business mechanism, and management level. Solving the problem of employment for the people is an important prerequisite for their peaceful living and work, as well as a foundation for building a harmonious society. The employment situation of private enterprises has always been of great concern to the outside world, and employment in private enterprises and by individuals has always occupied an important position in China's employment field that cannot be ignored. With the establishment of the market economy system, individual and private enterprises have become important components of the socialist economy, making significant contributions to economic development and social progress. The rapid development of China's economy is, on the one hand, the embodiment of the superiority of China's socialist market economic system and, on the other hand, the result of the role of the tertiary industry and private enterprises in promoting the national economy. Since the 1990s, China's private enterprises have become a new economic growth point for localities and even the country as a whole, and are one of the important ways to arrange employment and achieve social stability. This paper studies the employment of private enterprises and individuals from the perspective of statistics, extracts relevant data from the China Statistical Yearbook, uses statistical methods to process the data, draws conclusions, and puts forward constructive suggestions.
Keywords: Correlation Analysis of Employment Numbers; Factor Analysis; Principal Component Analysis; Cluster Analysis
Incident Detection Based on Differential Analysis
6
Authors: Mohammed Ali Elseddig, Mohamed Mejri. Journal of Information Security, 2024, Issue 3, pp. 378-409 (32 pages)
Internet services and web-based applications play pivotal roles in various sensitive domains, encompassing e-commerce, e-learning, e-healthcare, and e-payment. However, safeguarding these services poses a significant challenge, as the need for robust security measures becomes increasingly imperative. This paper presents a method based on differential analysis to detect abrupt changes in network traffic characteristics. The core concept revolves around identifying abrupt alterations in certain characteristics (such as input/output volume, the number of TCP connections, or DNS queries) within the analyzed traffic. Initially, the traffic is segmented into distinct sequences of slices, followed by quantifying specific characteristics for each slice. Subsequently, the distance between successive values of these measured characteristics is computed and clustered to detect sudden changes. To accomplish its objectives, the approach combines several techniques, including propositional logic, distance metrics (e.g., Kullback-Leibler divergence), and clustering algorithms (e.g., K-means). When applied to two distinct datasets, the proposed approach achieves detection rates of up to 100%.
Keywords: IDS; SOC; SIEM; KL-Divergence; K-means Clustering Algorithms; Elbow Method
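The pipeline described in the abstract above (slice the traffic, measure each slice, compute distances between successive slices, cluster the distances) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each slice is a hypothetical vector of counts, the distance is a smoothed Kullback-Leibler divergence, and the clustering step is a one-dimensional two-means split. All names and numbers are invented for the example.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two count vectors, with additive smoothing eps."""
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    sp, sq = sum(ps), sum(qs)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(ps, qs))


def two_means_1d(values, iters=20):
    """Split 1-D values into a low and a high cluster; label 1 means high."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        lows = [v for v, lab in zip(values, labels) if lab == 0]
        highs = [v for v, lab in zip(values, labels) if lab == 1]
        if lows:
            lo = sum(lows) / len(lows)
        if highs:
            hi = sum(highs) / len(highs)
    return labels


# Hypothetical per-slice counts (e.g. three kinds of requests per time slice).
slices = [[50, 30, 20], [48, 32, 20], [51, 29, 20], [5, 5, 90], [6, 4, 90]]

# Distance between each slice and the next one.
dists = [kl_divergence(slices[i], slices[i + 1]) for i in range(len(slices) - 1)]

# Slices that begin right after a "high" distance are flagged as abrupt changes.
labels = two_means_1d(dists)
changes = [i + 1 for i, lab in enumerate(labels) if lab == 1]
print(changes)  # [3]
```

Only the transition into slice 3 produces a large divergence, so it is the only slice flagged. Clustering the distances avoids choosing a fixed threshold by hand, at the cost of always needing at least one clearly larger distance to form the high cluster.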
Comprehensive K-Means Clustering
7
Authors: Ethan Xiao. Journal of Computer and Communications, 2024, Issue 3, pp. 146-159 (14 pages)
The k-means algorithm is a popular data clustering technique due to its speed and simplicity. However, it is susceptible to issues such as sensitivity to the chosen seeds, and inaccurate clusters due to poor initial seeds, particularly in complex datasets or datasets with non-spherical clusters. In this paper, a Comprehensive K-Means Clustering algorithm is presented, in which multiple trials of k-means are performed on a given dataset. The clustering results from each trial are transformed into a five-dimensional data point, containing the scope values of the x and y coordinates of the clusters along with the number of points within that cluster. A graph is then generated displaying the configuration of these points using Principal Component Analysis (PCA), from which we can observe and determine the common clustering patterns in the dataset. The robustness and strength of these patterns are then examined by observing the variance of the results of each trial, wherein a different subset of the data keeping a certain percentage of original data points is clustered. By aggregating information from multiple trials, we can distinguish clusters that consistently emerge across different runs from those that are more sensitive or unlikely, hence deriving more reliable conclusions about the underlying structure of complex datasets. Our experiments show that our algorithm is able to find the most common associations between different dimensions of data over multiple trials, often more accurately than other algorithms, as well as measure stability of these clusters, an ability that other k-means algorithms lack.
Keywords: k-means clustering
Slope deformation partitioning and monitoring points optimization based on cluster analysis
8
Authors: LI Yuan-zheng, SHEN Jun-hui, ZHANG Wei-xin, ZHANG Kai-qiang, PENG Zhang-hai, HUANG Meng. Journal of Mountain Science (SCIE, CSCD), 2023, Issue 8, pp. 2405-2421 (17 pages)
The scientific and reasonable positioning of monitoring locations for surface displacement on slopes is a prerequisite for early warning and forecasting. However, there is no specific provision on how to effectively determine the number and location of monitoring points according to the actual deformation characteristics of the slope, and there are still some defects in the layout of monitoring points. To this end, based on displacement data series and spatial location information of surface displacement monitoring points, and by combining displacement series correlation and spatial distance influence factors, a spatial deformation correlation calculation model of the slope based on clustering analysis was proposed to calculate the correlation between different monitoring points, based on which the deformation area of the slope was divided. The redundant monitoring points in each partition were eliminated based on the partition's outcome, and the overall optimal arrangement of slope monitoring points was then achieved. This method scientifically addresses the issues of slope deformation zoning and data gathering overlap. It not only eliminates human subjectivity from slope deformation zoning but also increases the efficiency and accuracy of slope monitoring. To verify the effectiveness of the method, a sand-mudstone interbedded counter-tilt excavation slope in Chongqing, China, was used as the research object. Twenty-four monitoring points deployed on this slope were monitored for surface displacement for 13 months, and the spatial location of the monitoring points was discussed. The results show that the proposed method of slope deformation zoning and the optimized placement of monitoring points are feasible.
Keywords: Excavation slope; Surface displacement monitoring; Spatial deformation analysis; Clustering analysis; Slope deformation partitioning; Monitoring point optimization
A Novel Cluster Analysis-Based Crop Dataset Recommendation Method in Precision Farming
9
Authors: K. R. Naveen Kumar, Husam Lahza, B. R. Sreenivasa, Tawfeeq Shawly, Ahmed A. Alsheikhy, H. Arunkumar, C. R. Nirmala. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 9, pp. 3239-3260 (22 pages)
Data mining and analytics involve inspecting and modeling large pre-existing datasets to discover decision-making information. Precision agriculture uses data mining to advance agricultural developments. Many farmers aren't getting the most out of their land because they don't use precision agriculture. They harvest crops without a well-planned recommendation system. Future crop production is calculated by combining environmental conditions and management behavior, yielding numerical and categorical data. Most existing research still needs to address data preprocessing and crop categorization/classification. Furthermore, statistical analysis receives less attention, despite producing more accurate and valid results. The study was conducted on a dataset about Karnataka state, India, with crops of eight parameters taken into account, namely the minimum amount of fertilizers required, such as nitrogen, phosphorus, potassium, and pH values. The research considers rainfall, season, soil type, and temperature parameters to provide precise cultivation recommendations for high productivity. The presented algorithm converts discrete numerals to factors first, then reduces levels. Second, the algorithm generates six datasets, two from Case-1 (dataset with many numeric variables), two from Case-2 (dataset with many categorical variables), and one from Case-3 (dataset with reduced factor variables). Finally, the algorithm outputs a class membership allocation based on an extended version of the K-means partitioning method with lambda estimation. The presented work produces mixed-type datasets with precisely categorized crops by organizing data based on environmental conditions, soil nutrients, and geo-location. Finally, the prepared dataset solves the classification problem, leading to a model evaluation that selects the best dataset for precise crop prediction.
Keywords: Data mining; crop prediction; k-prototypes; k-means; cluster; machine learning
Fuzzy cluster analysis of water mass in the western Taiwan Strait in spring 2019
10
Authors: Zhiyuan Hu, Jia Zhu, Longqi Yang, Zhenyu Sun, Xin Guo, Zhaozhang Chen, Linfeng Huang. Acta Oceanologica Sinica (SCIE, CAS, CSCD), 2023, Issue 12, pp. 1-8 (8 pages)
The classification of the springtime water mass has an important influence on the hydrography, regional climate change, and fishery in the Taiwan Strait. Based on 58 stations of CTD profiling data collected in the western and southwestern Taiwan Strait during the spring cruise of 2019, we analyze the spatial distributions of temperature (T) and salinity (S) in the investigation area. Then, by using the fuzzy cluster method combined with the T-S similarity number, we classify the investigation area into 5 water masses: the Minzhe Coastal Water (MZCW), the Taiwan Strait Mixed Water (TSMW), the South China Sea Surface Water (SCSSW), the South China Sea Subsurface Water (SCSUW), and the Kuroshio Branch Water (KBW). The MZCW appears in the near-surface layer along the western coast of the Taiwan Strait, showing low-salinity (<32.0) tongues near the Minjiang River Estuary and the Xiamen Bay mouth. The TSMW covers most of the upper layer of the investigation area. The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait, beneath which is the SCSUW. The KBW is a high-temperature (core value of 26.36 °C) and high-salinity (core value of 34.62) water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
Keywords: water mass classification; western Taiwan Strait; fuzzy cluster analysis; T-S similarity number
Integrated classification method of tight sandstone reservoir based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means
11
Authors: Bo-Han Wu, Ran-Hong Xie, Li-Zhi Xiao, Jiang-Feng Guo, Guo-Wen Jin, Jian-Wei Fu. Petroleum Science (SCIE, EI, CSCD), 2023, Issue 5, pp. 2747-2758 (12 pages)
In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data was used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in different categories were analyzed and compared with the lithological profiles. The analysis results of numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match well with the lithologic profile, which demonstrates the reliability of the classification method.
Keywords: Tight sandstone; Integrated reservoir classification; Principal component analysis; Simulated annealing genetic algorithm; Fuzzy cluster means
Optimization of constitutive parameters of foundation soils k-means clustering analysis (Cited by: 7)
12
Authors: Muge Elif Orakoglu, Cevdet Emin Ekinci. Research in Cold and Arid Regions (CSCD), 2013, Issue 5, pp. 626-636 (11 pages)
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests, and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations, and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Keywords: foundation soil; regression model; k-means clustering analysis
CPSO: Chaotic Particle Swarm Optimization for Cluster Analysis
13
Authors: Jiaji Wang. Journal of Artificial Intelligence and Technology, 2023, Issue 2, pp. 46-52 (7 pages)
Background: To better solve cluster analysis problems, we propose a new method based on the chaotic particle swarm optimization (CPSO) algorithm. Methods: We first evaluate the clustering performance of this model using the variance ratio criterion (VRC) as the evaluation metric. The effectiveness of the CPSO algorithm is compared with that of the traditional particle swarm optimization (PSO) algorithm. CPSO aims to improve the VRC value while avoiding local optimal solutions. The simulated dataset is set at three levels of overlapping: non-overlapping, partial overlapping, and severe overlapping. Finally, we compare CPSO with two other methods. Results: In the comparison, the proposed CPSO method performs best. Under the non-overlapping, partial overlapping, and severe overlapping conditions, our method has the best VRC values of 1683.2, 620.5, and 275.6, respectively. The mean VRC values in these three cases are 1683.2, 617.8, and 222.6. Conclusion: CPSO performed better than the other methods and is effective for cluster analysis problems.
Keywords: cluster analysis; chaotic particle swarm optimization; variance ratio criterion
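The variance ratio criterion (VRC) used as the metric in the abstract above is commonly defined as the between-cluster dispersion divided by (k - 1), over the within-cluster dispersion divided by (n - k). The sketch below computes it for a given labeling under that standard definition. The function name and the toy data are hypothetical, and the code does not reproduce the CPSO search itself.

```python
def variance_ratio_criterion(points, labels):
    """VRC = [B / (k - 1)] / [W / (n - k)] for a labeled set of points.

    B is the between-cluster sum of squares, W the within-cluster one.
    Raises ValueError for fewer than 2 clusters or when n <= k.
    """
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((a - b) ** 2 for a, b in zip(centroid, overall))
        within += sum(
            sum((a - b) ** 2 for a, b in zip(p, centroid)) for p in members
        )
    # Note: a perfect clustering with within == 0 would divide by zero here.
    return (between / (k - 1)) / (within / (n - k))


# Toy data: three tight groups of three points each, labeled by group.
data = [(0, 0), (0, 1), (1, 0),
        (10, 10), (10, 11), (11, 10),
        (20, 0), (20, 1), (21, 0)]
good_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
vrc = variance_ratio_criterion(data, good_labels)
print(round(vrc, 6))  # 600.0
```

For this labeling B = 800 and W = 4, so the VRC is (800 / 2) / (4 / 6) = 600. A labeling that mixes the groups yields a much smaller value, which is why an optimizer can use the VRC as its objective.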
A State of Art Analysis of Telecommunication Data by k-Means and k-Medoids Clustering Algorithms
14
Authors: T. Velmurugan. Journal of Computer and Communications, 2018, Issue 1, pp. 190-202 (13 pages)
Cluster analysis is one of the major data analysis methods widely used for many practical applications in emerging areas of data mining. A good clustering method will produce high-quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends of available data and its uses for the real world. This research work is carried out to find the performance of two of the most representative partition-based clustering algorithms, namely k-Means and k-Medoids. A state-of-the-art analysis of these two algorithms is implemented, and performance is analyzed based on the quality of their clustering results, their execution time, and other components. Telecommunication data is the source data for this analysis. Connection-oriented broadband data is given as input to find the clustering quality of the algorithms. The distance between the server locations and their connections is considered for clustering. Execution time for each algorithm is analyzed and the results are compared with one another. Results found in the comparison study are satisfactory for the chosen application.
Keywords: k-Means Algorithm; k-Medoids Algorithm; Data Clustering; Time Complexity; Telecommunication Data
Campus Economic Analysis Based on K-Means Clustering and Hotspot Mining
15
Authors: Xiuzhang Yang, Shuai Wu, Huan Xia, Yuanbo Li, Xin Li. Review of Educational Theory, 2020, Issue 2, pp. 42-50 (9 pages)
With the advent of the era of big data and the development and construction of smart campuses, the campus is gradually moving towards digitalization, networking, and informatization. The campus card is an important part of the construction of a smart campus, and the massive data it generates can indirectly reflect the living conditions of students at school. For campus card data, how to quickly and accurately obtain the information required by users from massive datasets has become an urgent problem. This paper proposes a data mining algorithm based on K-Means clustering and time series. It analyzes the card consumption data of students at a college to mine their daily consumer behavior habits and to make an accurate judgment on specific consumer behavior. The algorithm proposed in this paper provides a practical reference for the construction of smart campuses in universities, and has important theoretical and application value.
Keywords: Machine learning; k-means clustering; Data mining; Consumer behavior; Campus economy; Economic regionalization
Investigation of the J-TEXT plasma events by k-means clustering algorithm (Cited by: 1)
16
Authors: 李建超, 张晓卿, 张昱, Abba Alhaji BALA, 柳惠平, 周帼红, 王能超, 李达, 陈忠勇, 杨州军, 陈志鹏, 董蛟龙, 丁永华, the J-TEXT Team. Plasma Science and Technology (SCIE, EI, CAS, CSCD), 2023, Issue 8, pp. 38-43 (6 pages)
Various types of plasma events emerge in specific parameter ranges and exhibit similar characteristics in diagnostic signals, which can be applied to identify these events. A semi-supervised machine learning algorithm, the k-means clustering algorithm, is utilized to investigate and identify plasma events in the J-TEXT plasma. This method can cluster diverse plasma events with homogeneous features, and these events can then be identified given a few manually labeled examples based on physical understanding. A survey of clustered events reveals that the k-means algorithm can make plasma events (rotating tearing mode, sawtooth oscillations, and locked mode) gather in a Euclidean space composed of multi-dimensional diagnostic data, such as soft x-ray emission intensity, edge toroidal rotation velocity, and the Mirnov signal amplitude. Based on the cluster analysis results, an approximate analytical model is proposed to rapidly identify plasma events in the J-TEXT plasma. The cluster analysis method is conducive to labeling massive diagnostic data.
Keywords: k-means; cluster analysis; plasma event; machine learning
Analysis of CLARANS Algorithm for Weather Data Based on Spark (Cited by: 1)
17
Authors: Jiahao Zhang, Honglin Wang. Computers, Materials & Continua (SCIE, EI), 2023, Issue 8, pp. 2427-2441 (15 pages)
With the rapid development of technology, processing the explosive growth of meteorological data on traditional standalone computing has become increasingly time-consuming and cannot meet the demands of scientific research and business. Therefore, this paper proposes the implementation of the parallel Clustering Large Application based upon RANdomized Search (CLARANS) clustering algorithm on the Spark cloud computing platform to cluster China's climate regions using meteorological data from 1988 to 2018. The aim is to address the challenge of applying clustering algorithms to large datasets. In this paper, the morphological similarity distance is adopted as the similarity measurement standard instead of Euclidean distance, which improves clustering accuracy. Furthermore, the issue of local optima caused by an improper selection of initial clustering centers is addressed by utilizing the max-distance criterion. Compared to the k-means clustering algorithm already implemented in the Spark platform, the proposed algorithm is more robust and can reduce the interference of outliers in the dataset on clustering results. It also has higher parallel performance than the frequently used serial algorithms, thus improving the efficiency of big data analysis. The experiment compares the clustered centroid data with the annual average meteorological data of representative cities in the five typical meteorological regions that exist in China, and the results show that the clustering results are in good agreement with the meteorological data obtained from the National Meteorological Science Data Center. This algorithm has a positive effect on the clustering analysis of massive meteorological data and deserves attention in scientific research activities.
Keywords: clustering analysis; cloud computing platform; parallel algorithm
Group decision-making method based on entropy and experts cluster analysis (Cited by: 12)
18
作者 Xuan Zhou Fengming Zhang Xiaobin Hui Kewu Li 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2011年第3期468-472,共5页
In group decision-making, existing cluster-analysis methods for determining experts' weights when aggregating their evaluation information take into account the experts' preferences and the consistency of their collating vectors, but they lack a measure of information similarity. As a result, an expert whose collating vector is similar to the group consensus but whose information uncertainty is large may still be clustered into a larger group and given a high weight. To address this, a new aggregation method based on entropy and cluster analysis is provided for the group decision-making process: the collating vectors are classified using an information similarity coefficient, and the experts' weights are determined according to the classification result, the entropy of the collating vectors, and the consistency of the judgment matrices. Finally, a numerical example shows that the method is feasible and effective.
Keywords: group decision-making, judgment matrix, entropy, information similarity coefficient, cluster analysis
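To make the role of entropy in the abstract above more concrete, the following minimal sketch computes the Shannon entropy of each expert's collating (priority) vector and gives higher weight to experts whose vectors carry less uncertainty. The weighting rule used here (weight proportional to one minus the normalized entropy) is an assumption chosen for illustration, not the paper's formula; the paper also uses the classification result and judgment matrix consistency, which are omitted. The names `shannon_entropy` and `entropy_weights` are hypothetical.

```python
import math


def shannon_entropy(vec):
    """Shannon entropy (natural log) of a non-negative vector after normalization."""
    total = sum(vec)
    probs = [v / total for v in vec]
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weights(vectors):
    """Assumed rule: weight is proportional to 1 - H / ln(n), so lower entropy means higher weight."""
    max_h = math.log(len(vectors[0]))  # entropy of a uniform vector of length n > 1
    scores = [max(0.0, 1.0 - shannon_entropy(v) / max_h) for v in vectors]
    total = sum(scores)
    if total == 0:
        # All experts are maximally uncertain: fall back to equal weights.
        return [1.0 / len(vectors)] * len(vectors)
    return [s / total for s in scores]


# Three hypothetical experts ranking three alternatives.
vectors = [
    [0.7, 0.2, 0.1],        # decisive expert, low entropy
    [0.4, 0.3, 0.3],        # less decisive expert
    [1 / 3, 1 / 3, 1 / 3],  # uniform vector, maximum entropy
]
weights = entropy_weights(vectors)
print(weights)
```

In this toy case the uniform vector receives a weight of essentially zero, which mirrors the abstract's concern that an expert with large information uncertainty should not be given a high weight merely for falling into a large cluster.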
Clustering Structure Analysis in Time-Series Data With Density-Based Clusterability Measure (Cited by: 6)
19
Authors: Juho Jokinen, Tomi Raty, Timo Lintonen. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2019, No. 6, pp. 1332-1343 (12 pages)
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure on the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Keywords: clustering, exploratory data analysis, time-series, unsupervised learning
Cluster analysis on summer precipitation field over Qinghai-Tibet Plateau from 1961 to 2004 (Cited by: 7)
20
Authors: LU Heli, SHAO Quanqin, LIU Jiyuan, WANG Junbang, CHEN Shenbin, CHEN Zhuoqi. Journal of Geographical Sciences (SCIE, CSCD), 2008, No. 3, pp. 295-307 (13 pages)
Daily summer precipitation data from 97 meteorological stations on the Qinghai-Tibet Plateau from 1961 to 2004 were selected to analyze the temporal-spatial distribution through accumulated variance, correlation analysis, regression analysis, empirical orthogonal function, power spectrum function and the spatial analysis tools of GIS. The results showed that summer precipitation accounted for a relatively high proportion in the areas of the Plateau with less annual precipitation, and that the correlation between summer precipitation and annual precipitation was strong. Station altitude and summer precipitation tendency showed a stronger positive correlation below 2000 m, with a correlation value of up to 0.604 (α=0.01). The differences in tendency values between 1961-1983 and 1984-2004 at five altitude ranges (2000-2500 m, 2500-3000 m, 3500-4000 m, 4000-4500 m and above 4500 m) were above zero and accounted for 71.4% of the total. Using the empirical orthogonal function, summer precipitation could be roughly divided into three precipitation pattern fields: the Southeast Plateau Pattern Field, the Northeast Plateau Pattern Field and the Three Rivers' Headstream Regions Pattern Field. The former two had values of opposite sign from north to south, with the dividing line along 35°N. The potential cycles of the three pattern fields were 5.33 a, 21.33 a and 2.17 a respectively, tested at the 90% confidence level. Station altitude and summer precipitation potential cycle showed a strong negative correlation at stations above 4500 m, with a correlation value of -0.626 (α=0.01). In the Three Rivers' Headstream Regions, the summer precipitation cycle decreased as altitude rose at stations above 3500 m and increased as altitude rose at those below 3500 m. The empirical orthogonal function analysis of June, July and August precipitation showed that the June precipitation pattern field was similar to July's, in which the southern Plateau was positive and the northern Plateau negative, but the positive-value area in the July precipitation pattern field was obviously smaller than in June's. The August pattern field was totally opposite to June's and July's: the positive area in the August pattern field jumped from the southern Plateau to the northern Plateau.
Keywords: Qinghai-Tibet Plateau, summer precipitation, cluster analysis, precipitation pattern field, precipitation cycle
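The empirical orthogonal function (EOF) analysis named in the abstract above can be sketched as finding the leading eigenvector of the covariance matrix of station anomalies. The example below is a minimal pure-Python illustration using power iteration. The function name `leading_eof`, the toy data, and the choice of power iteration are all assumptions made for this sketch; the listing does not say how the authors computed their EOFs.

```python
import math


def leading_eof(data, iters=200):
    """Return (pattern, explained_fraction) for the leading EOF.

    data: list of time steps, each a list of station values.
    Hypothetical helper using power iteration on the anomaly covariance.
    """
    n_t, n_s = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n_t for j in range(n_s)]
    anom = [[row[j] - means[j] for j in range(n_s)] for row in data]
    cov = [[sum(a[i] * a[j] for a in anom) / (n_t - 1) for j in range(n_s)]
           for i in range(n_s)]
    # Non-uniform start vector, to reduce the chance of starting orthogonal
    # to the leading eigenvector.
    v = [1.0 + 0.1 * j for j in range(n_s)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(n_s)) for i in range(n_s)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("covariance matrix is zero or start vector is degenerate")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace is the total variance.
    cv = [sum(cov[i][j] * v[j] for j in range(n_s)) for i in range(n_s)]
    eigenvalue = sum(v[i] * cv[i] for i in range(n_s))
    trace = sum(cov[i][i] for i in range(n_s))
    return v, eigenvalue / trace


# Toy data: stations 0 and 1 vary together, station 2 varies in opposition.
data = [[1.0, 2.0, -1.0], [2.0, 4.0, -2.0], [3.0, 6.0, -3.0], [4.0, 8.0, -4.0]]
eof, fraction = leading_eof(data)
print(eof, fraction)
```

The overall sign of an EOF pattern is arbitrary, so only the relative signs between stations are meaningful. In the toy data the third station takes the opposite sign to the first two, which is the kind of opposite-sign structure the abstract reports between the northern and southern Plateau.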