A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, the location must have been monitored previously for this type of information to have been recorded. Another significant limitation is the need for information on both the location and the timing of incidents. Although location information is now easy to obtain (GPS, drone images, etc.), the timing of the event remains difficult to ascertain for a considerable portion of landslide data. Rainfall can be monitored in multiple ways, for instance by examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h) or by calculating effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rainfall monitoring approach are defined manually and subjectively, relying on the operators' experience. This makes the process labor-intensive and time-consuming, and it hinders the establishment of a truly standardized and rapidly scalable methodology. In this work, we propose a Landslide Early Warning System (LEWS) based on the concept of rainfall half-life and the determination of thresholds using cluster analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one operated by Cemaden, Brazil's National Center for Monitoring and Early Warning of Natural Disasters.
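The abstract does not give the formula behind the rainfall half-life concept. As a minimal sketch, assuming an exponential decay in which each hourly rainfall amount loses half of its weight every `half_life_h` hours, a decay-weighted rainfall total could be computed as follows. The function name, the hourly sampling, and the choice of decay law are illustrative assumptions, not the authors' method.

```python
from typing import Sequence


def effective_rainfall(hourly_mm: Sequence[float], half_life_h: float) -> float:
    """Decay-weighted rainfall total at the time of the last observation.

    hourly_mm   -- hourly rainfall amounts in mm, oldest first (assumed layout)
    half_life_h -- hours after which an amount counts for half (assumed parameter)
    """
    if half_life_h <= 0:
        raise ValueError("half_life_h must be positive")
    n = len(hourly_mm)
    total = 0.0
    for i, rain in enumerate(hourly_mm):
        age_h = n - 1 - i  # hours elapsed since this observation
        total += rain * 0.5 ** (age_h / half_life_h)
    return total


if __name__ == "__main__":
    # 10 mm two hours ago counts for 5 mm when the half-life is 2 h.
    print(effective_rainfall([10.0, 0.0, 10.0], half_life_h=2.0))
```

A short half-life makes the total respond mainly to recent intense rain, while a long half-life approaches a plain accumulation, so a single parameter spans the fixed 1 h to 72 h windows mentioned in the abstract.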
In this paper, CiteSpace, a bibliometric software package, was used to analyze research papers indexed in the Web of Science that are relevant to biological models and effluent quality prediction in the activated sludge process for wastewater treatment. Using trend maps, keyword knowledge maps, and co-citation knowledge maps, the authors, institutions, and regions were visualized and identified. Furthermore, the topics and hotspots of water quality prediction in the activated sludge process were determined through literature co-citation cluster analysis and citation burst analysis, which not only reflect the historical evolution of the field to a certain extent but also provide direction and insight into the knowledge structure of water quality prediction and the activated sludge process for future research.
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow rule, and other methods were used together to build logistic regression, cluster analysis, hyper-parameter test, and other models. SPSS, Python, and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
Over the past 30 years, Chinese enterprises have been a topic of wide public discussion and concern in terms of economic and social status, ownership structure, business mechanisms, and management level. Solving the employment problem is an important prerequisite for people to live and work in peace, as well as a foundation for building a harmonious society. The employment situation of individual and private enterprises has always attracted great attention, and these two sectors occupy a position in China's employment field that cannot be ignored. With the establishment of the market economy system, individual and private enterprises have become important components of the socialist economy, making significant contributions to economic development and social progress. The rapid development of China's economy reflects, on the one hand, the superiority of China's socialist market economic system and, on the other hand, the role of the tertiary industry and private enterprises in promoting the national economy. Since the 1990s, China's private enterprises have become a new growth point for local economies and even the national economy, and they are one of the important channels for providing employment and achieving social stability. This paper studies employment in private enterprises and individual businesses from a statistical perspective: it extracts relevant data from the China Statistical Yearbook, processes the data with statistical methods, draws conclusions, and puts forward constructive suggestions.
The scientific and rational positioning of monitoring locations for surface displacement on slopes is a prerequisite for early warning and forecasting. However, there is no specific provision on how to determine the number and location of monitoring points according to the actual deformation characteristics of the slope, and the layout of monitoring points still has defects. To this end, based on the displacement data series and spatial location information of surface displacement monitoring points, a spatial deformation correlation calculation model for slopes based on clustering analysis was proposed. By combining displacement series correlation with spatial distance influence factors, the model calculates the correlation between different monitoring points, on the basis of which the deformation area of the slope is divided. The redundant monitoring points in each partition were eliminated based on the partition result, and the overall optimal arrangement of slope monitoring points was then achieved. This method addresses the issues of slope deformation zoning and overlapping data collection. It not only eliminates human subjectivity from slope deformation zoning but also increases the efficiency and accuracy of slope monitoring. To verify the effectiveness of the method, a sand-mudstone interbedded counter-tilt excavation slope in Chongqing, China was used as the research object. Twenty-four monitoring points deployed on this slope were monitored for surface displacement for 13 months, and the spatial location of the monitoring points was discussed. The results show that the proposed method of slope deformation zoning and the optimized placement of monitoring points are feasible.
The classification of springtime water masses has an important influence on the hydrography, regional climate change and fishery in the Taiwan Strait. Based on CTD profiling data from 58 stations collected in the western and southwestern Taiwan Strait during the spring cruise of 2019, we analyze the spatial distributions of temperature (T) and salinity (S) in the investigation area. Then, by using the fuzzy cluster method combined with the T-S similarity number, we classify the investigation area into 5 water masses: the Minzhe Coastal Water (MZCW), the Taiwan Strait Mixed Water (TSMW), the South China Sea Surface Water (SCSSW), the South China Sea Subsurface Water (SCSUW) and the Kuroshio Branch Water (KBW). The MZCW appears in the near-surface layer along the western coast of the Taiwan Strait, showing low-salinity (<32.0) tongues near the Minjiang River Estuary and the Xiamen Bay mouth. The TSMW covers most of the upper layer of the investigation area. The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait, beneath which is the SCSUW. The KBW is a high-temperature (core value of 26.36℃) and high-salinity (core value of 34.62) water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
In this research, an integrated classification method based on principal component analysis, a simulated annealing genetic algorithm, and fuzzy c-means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data was used as input. The unsupervised classification of the tight sandstone reservoir was carried out with SAGA-FCM, and the characteristics of the reservoir in the different categories were analyzed and compared with the lithological profiles. The analysis results of numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match well with the lithologic profile, which demonstrates the reliability of the classification method.
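The degree of membership mentioned in the abstract can be illustrated with the standard fuzzy c-means membership rule, u_i = 1 / Σ_k (d_i / d_k)^(2/(m-1)), where d_i is the distance from a sample to cluster center i and m > 1 is the fuzzifier. The sketch below covers only this single step for one sample; the PCA and simulated annealing genetic algorithm stages of the paper are not reproduced, and the function name and the default m = 2 are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def fcm_memberships(point: Sequence[float],
                    centers: Sequence[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """Fuzzy c-means membership of one point in each cluster (sums to 1)."""
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    on_center = [i for i, d in enumerate(dists) if d == 0.0]
    if on_center:
        share = 1.0 / len(on_center)
        return [share if i in on_center else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


if __name__ == "__main__":
    # Distances 1 and 3 to the two centers give memberships of about 0.9 and 0.1.
    print(fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)]))
```

Because every sample receives a graded membership rather than a hard label, samples near a class boundary can be flagged as transitional instead of being forced into one class.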
Background: To better solve the cluster analysis problem, we propose a new method based on the chaotic particle swarm optimization (CPSO) algorithm. Methods: In order to enhance clustering performance, we propose a novel method based on CPSO. We first evaluate the clustering performance of this model using the variance ratio criterion (VRC) as the evaluation metric. The effectiveness of the CPSO algorithm is compared with that of the traditional particle swarm optimization (PSO) algorithm. The CPSO aims to improve the VRC value while avoiding local optimal solutions. The simulated dataset is set at three levels of overlapping: non-overlapping, partial overlapping, and severe overlapping. Finally, we compare CPSO with two other methods. Results: In the comparative results, our proposed CPSO method performs outstandingly. Under the conditions of non-overlapping, partial overlapping, and severe overlapping, our method has the best VRC values of 1683.2, 620.5, and 275.6, respectively. The mean VRC values in these three cases are 1683.2, 617.8, and 222.6. Conclusion: The CPSO performed better than the other methods on cluster analysis problems. CPSO is effective for cluster analysis.
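The variance ratio criterion used as the evaluation metric is commonly defined (it is also known as the Calinski-Harabasz index) as VRC = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster sum of squares, W is the within-cluster sum of squares, k is the number of clusters, and n is the number of samples. The following sketch computes it for a given labeling; the optimizer itself (PSO or CPSO) is not reproduced, and the function names are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence

Point = Sequence[float]


def _centroid(points: Sequence[Point]) -> List[float]:
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def _sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variance_ratio_criterion(points: Sequence[Point],
                             labels: Sequence[Hashable]) -> float:
    """VRC of a hard clustering; larger values mean better-separated clusters."""
    n = len(points)
    clusters: Dict[Hashable, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")
    overall = _centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = _centroid(members)
        between += len(members) * _sq_dist(c, overall)
        within += sum(_sq_dist(p, c) for p in members)
    if within == 0.0:
        return float("inf")  # every point sits on its cluster centroid
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(variance_ratio_criterion(pts, [0, 0, 1, 1]))  # well-separated split
    print(variance_ratio_criterion(pts, [0, 1, 0, 1]))  # poor split
```

A search procedure such as PSO would treat this value as the fitness to maximize over candidate cluster assignments.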
With the rapid development of technology, processing the explosively growing volume of meteorological data on traditional standalone computers has become increasingly time-consuming and can no longer meet the demands of scientific research and business. Therefore, this paper proposes a parallel implementation of the Clustering Large Application based upon RANdomized Search (CLARANS) clustering algorithm on the Spark cloud computing platform to cluster China's climate regions using meteorological data from 1988 to 2018. The aim is to address the challenge of applying clustering algorithms to large datasets. In this paper, the morphological similarity distance is adopted as the similarity measure instead of the Euclidean distance, which improves clustering accuracy. Furthermore, the issue of local optima caused by an improper selection of initial clustering centers is addressed by using the max-distance criterion. Compared with the k-means clustering algorithm already implemented on the Spark platform, the proposed algorithm is more robust and reduces the interference of outliers in the dataset with the clustering results. It also has higher parallel performance than the frequently used serial algorithms, thus improving the efficiency of big data analysis. The experiment compares the clustered centroid data with the annual average meteorological data of representative cities in the five typical meteorological regions of China, and the results show that the clustering results are in good agreement with the meteorological data obtained from the National Meteorological Science Data Center. This algorithm has a positive effect on the clustering analysis of massive meteorological data and deserves attention in scientific research activities.
In the aggregation of experts' evaluation information in group decision-making, existing methods for determining experts' weights based on cluster analysis take into account the experts' preferences and the consistency of their collating vectors, but they lack a measure of information similarity. As a result, an expert whose collating vector is similar to the group consensus but whose information uncertainty is great may nevertheless be clustered into a larger group and given a high weight. To address this, a new aggregation method based on entropy and cluster analysis for the group decision-making process is proposed, in which the collating vectors are classified using an information similarity coefficient, and the experts' weights are determined according to the classification result, the entropy of the collating vectors, and the consistency of the judgment matrices. Finally, a numerical example shows that the method is feasible and effective.
Clustering is used to gain an intuition of the structures in the data. Most of the current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure on the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce a meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Daily summer precipitation data from 97 meteorological stations on the Qinghai-Tibet Plateau from 1961 to 2004 were selected to analyze the temporal-spatial distribution using accumulated variance, correlation analysis, regression analysis, empirical orthogonal functions, power spectrum functions, and the spatial analysis tools of GIS. The results showed that summer precipitation accounted for a relatively high proportion in areas of the Plateau with less annual precipitation, and that the correlation between summer precipitation and annual precipitation was strong. Station altitude and summer precipitation tendency showed a stronger positive correlation below 2000 m, with a correlation value of up to 0.604 (α=0.01). The differences in tendency values between 1961-1983 and 1984-2004 in five altitude ranges (2000-2500 m, 2500-3000 m, 3500-4000 m, 4000-4500 m and above 4500 m) were above zero and accounted for 71.4% of the total. Using empirical orthogonal functions, summer precipitation could be roughly divided into three precipitation pattern fields: the Southeast Plateau Pattern Field, the Northeast Plateau Pattern Field, and the Three Rivers' Headstream Regions Pattern Field. The former two showed opposite signs between north and south, with the dividing line along 35°N. The potential cycles of the three pattern fields were 5.33 a, 21.33 a, and 2.17 a, respectively, tested at a confidence level of 90%. Station altitude and the potential cycle of summer precipitation showed a strong negative correlation for stations above 4500 m, with a correlation value of -0.626 (α=0.01). In the Three Rivers' Headstream Regions, the summer precipitation cycle decreased with rising altitude for stations above 3500 m and increased with rising altitude for those below 3500 m. The empirical orthogonal function analysis of June, July, and August precipitation showed that the June precipitation pattern field was similar to that of July, with the southern Plateau positive and the northern Plateau negative, but the positive area in the July pattern field was clearly smaller than in June. The August pattern field was the opposite of those of June and July: the positive area jumped from the southern Plateau to the northern Plateau.
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Using gas chromatography, the fingerprints of six crude oils distributed across four oilfields and four oil platforms were analyzed, and the corresponding normal paraffin hydrocarbon (including pristane and phytane) concentrations were obtained by the internal standard method. The normal paraffin hydrocarbon distribution patterns of the six crude oils were built and compared. Cluster analysis of the normal paraffin hydrocarbon concentrations was conducted for classification, and some ratios were used to compare the oils. The results indicated a clear difference among crude oils from different oilfields and a small difference between crude oils from the same oil platform. The normal paraffin hydrocarbon distribution patterns and ratios, as well as the cluster analysis of the normal paraffin hydrocarbon concentrations, can differentiate crude oils with small differences better than the original gas chromatogram.
[Objectives] This study aimed to establish an HPLC fingerprint and conduct cluster analysis and principal component analysis for Citri Reticulatae Pericarpium Viride. [Methods] Using the HPLC method, the determination was performed on an XSelect® HSS T3 C18 column with a mobile phase of acetonitrile-0.5% acetic acid solution (gradient elution) at a flow rate of 1.0 mL/min. The detection wavelength was 360 nm. The column temperature was 25℃. The injection volume was 10 μL. With the peak of hesperidin as the reference, HPLC fingerprints of 10 batches of Citri Reticulatae Pericarpium Viride were determined. The similarity of the 10 batches of samples was evaluated with the Similarity Evaluation System for Chromatographic Fingerprint of TCM (2012 edition) to determine the common peaks. Cluster analysis and principal component analysis were performed using SPSS 17.0 statistical software. [Results] The HPLC fingerprints of the 10 batches of medicinal materials had a total of 11 common peaks, and the similarity was 0.919-1.000, indicating that the chemical composition of the 10 batches was consistent. There were 11 common components in the 10 batches of medicinal materials, but their contents differed. When the Euclidean distance was 20, the 10 batches of samples were divided into two categories, with S4 in the first category and the others in the second. When the Euclidean distance was 5, the second category could be further divided into two sub-categories, with S1 and S10 in one sub-category and S2, S3, S5, S6, S7, S8 and S9 in the other. The principal component analysis showed that the cumulative contribution rate of the two main component factors was 92.797%, and S7 had the highest comprehensive score, indicating the best quality. [Conclusions] The results of HPLC fingerprinting, cluster analysis and principal component analysis can provide a reference for the quality control of Citri Reticulatae Pericarpium Viride.
The diversity of 60 conventional japonica rice accessions with good eating quality from China and abroad was analyzed using SSR molecular markers, agronomic traits and taste characteristics. A total of 290 alleles were detected in the 60 accessions at 72 SSR loci, with high similarity coefficients varying between 0.600 and 0.924. The loci on chromosome 5 showed the greatest average allele number. Additionally, most of the SSR loci could detect 3 to 4 alleles. A UPGMA dendrogram based on cluster analysis of the genetic similarity coefficients showed that the grouping trend of part of the rice accessions was geographically related, and most of the rice accessions from Jiangsu Province, China were clustered together. Furthermore, many domestic accessions of southern and northern origin in China were close to the foreign japonica rice varieties, consistent with their pedigrees, which trace back to foreign high-quality sources. For taste characteristics, part of the accessions with excellent taste were clearly clustered into one category although they came from different geographical regions, which indicates that the taste characteristics of some varieties were mainly genetically determined. In addition, the agronomic traits of japonica rice with good taste might be closely related to their geographical origins, but the relationship between superior taste characteristics and agronomic traits should be further clarified.
Using principal component analysis (PCA) and cluster analysis, a multi-index evaluation system for fruit and vegetable nutrition is standardized, reduced in dimension, and de-correlated. Principal component factors are assigned based on cluster analysis of the loading matrix, combined with the actual meaning and evaluation direction of the index categories. The nutritional richness of fruits and vegetables is then evaluated according to their nutrition scores, and finally, equivalent replacement suggestions for fruits and vegetables in different seasons are given according to the clustering results. The study shows that the principal component cluster method can not only classify multivariate data reasonably and effectively but also evaluate the sample objects reasonably, providing a sound basis for the evaluation of fruit and vegetable nutrition.
For the first time, we used the Tullgren method to study the vertical migration of the soil mesofauna in the Dongying Halophytes Garden in the Yellow River Delta (YRD), Shandong Province, and applied cluster analysis to the communities. The results showed that the soil mesofauna tended to gather at the soil surface in most samples at most times, but the vertical migration varied greatly with season and environmental conditions. Acari was the dominant group. The diversity index of the soil fauna was correlated with the evenness index. The number of Acari individuals affected the numbers of other groups and individuals. The dominant group, Acari, made the greater contribution to the result of the cluster analysis, and cluster analysis with both the Bray-Curtis and the Jaccard similarity coefficients showed significant differences between communities in different habitats.
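The two similarity coefficients named in the abstract can be sketched as follows, assuming each community is represented as a list of individual counts per taxon in a fixed order (a data layout chosen here for illustration; the abstract does not describe the authors' data format). Bray-Curtis similarity uses abundances, 2·Σ min(a_i, b_i) / (Σ a_i + Σ b_i), whereas Jaccard similarity uses only presence or absence, |A ∩ B| / |A ∪ B|.

```python
from typing import Sequence


def bray_curtis_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Abundance-based similarity in [0, 1]; 1 means identical counts."""
    if len(a) != len(b):
        raise ValueError("both communities must list the same taxa")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both communities are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / total


def jaccard_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Presence/absence similarity in [0, 1]; abundances are ignored."""
    if len(a) != len(b):
        raise ValueError("both communities must list the same taxa")
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both communities are empty")
    return len(present_a & present_b) / len(union)


if __name__ == "__main__":
    habitat_1 = [10, 0, 5]  # hypothetical counts for three taxa
    habitat_2 = [5, 5, 0]
    print(bray_curtis_similarity(habitat_1, habitat_2))
    print(jaccard_similarity(habitat_1, habitat_2))
```

Because Bray-Curtis weights taxa by abundance, a numerically dominant group has a large influence on it, while Jaccard treats every taxon equally regardless of its count.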
The genetic diversity of 41 parental lines popularized in commercial hybrid rice production in China was studied by using cluster analysis of morphological traits and simple sequence repeat (SSR) markers. Based on morphological trait cluster analysis, the 41 entries were assigned to two clusters (i.e., an early or medium-maturing cluster and a medium or late-maturing cluster) and further assigned to six sub-clusters. The early or medium-maturing cluster was composed of 15 maintainer lines, four early-maturing restorer lines and two thermo-sensitive genic male sterile lines, and the medium or late-maturing cluster included 16 restorer lines and four medium or late-maturing maintainer lines. Moreover, the SSR cluster analysis classified the 41 entries into two groups (i.e., a maintainer line group and a restorer line group) and seven sub-groups. The maintainer line group consisted of all 19 maintainer lines and the two thermo-sensitive genic male sterile lines, while the restorer line group was composed of all 20 restorer lines. The SSR analysis fitted better with the pedigree information. From the perspective of hybrid rice breeding, the results suggested that SSR analysis might be a better method for studying the diversity of parental lines in indica hybrid rice.
Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, which leads to wrong classification when these techniques are applied. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. The algorithm clusters the data based on the idea of a graph. Its core idea is that seismic data are regarded as points in space, and the points are connected by edges to construct a graph. When the graph is divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the edges within each subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. By selecting a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented, which reduces the cost of the spectral clustering operation. Verification on a physical model and field data shows that the proposed approach can obtain more accurate seismic facies classification results without requiring the data to satisfy any distributional hypothesis. The computing efficiency of the new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
文摘A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, there is a requirement for the location to have been previously monitored in some way to have this type of information recorded. Another significant limitation is the need for information regarding the location and timing of incidents. Despite the current ease of obtaining location information (GPS, drone images, etc.), the timing of the event remains challenging to ascertain for a considerable portion of landslide data. Concerning rainfall monitoring, there are multiple ways to consider it, for instance, examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h), as well as in the calculation of effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rain monitoring approach are defined manually and subjectively, relying on the operators’ experience. This makes the process labor-intensive and time-consuming, hindering the establishment of a truly standardized and rapidly scalable methodology on a large scale. In this work, we propose a Landslides Early Warning System (LEWS) based on the concept of rainfall half-life and the determination of thresholds using Cluster Analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one utilized by Cemaden, Brazil’s National Center for Monitoring and Early Warning of Natural Disasters.
文摘In this paper, CiteSpace, a bibliometrics software, was adopted to collect research papers published on the Web of Science, which are relevant to biological model and effluent quality prediction in activated sludge process in the wastewater treatment. By the way of trend map, keyword knowledge map, and co-cited knowledge map, specific visualization analysis and identification of the authors, institutions and regions were concluded. Furthermore, the topics and hotspots of water quality prediction in activated sludge process through the literature-co-citation-based cluster analysis and literature citation burst analysis were also determined, which not only reflected the historical evolution progress to a certain extent, but also provided the direction and insight of the knowledge structure of water quality prediction and activated sludge process for future research.
Abstract: For the composition analysis and identification of ancient glass products, methods including L1 regularization, K-means cluster analysis, and the elbow rule were combined to build logistic regression, cluster analysis, and hyper-parameter test models. Tools such as SPSS and Python were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
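As an illustration of how K-means and the elbow rule fit together, here is a minimal standard-library sketch. It is not the authors' pipeline: the deterministic initialisation, the function names `kmeans` and `elbow_k`, and the "largest second difference" reading of the elbow rule are assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centers, within-cluster sum of squares)."""
    # Assumed deterministic start: k points spread evenly through the sorted data.
    ordered = sorted(points)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def elbow_k(sse_by_k):
    """Pick the k where the SSE curve bends most (largest second difference)."""
    ks = sorted(sse_by_k)
    return max(
        ks[1:-1],
        key=lambda k: sse_by_k[k - 1] - 2 * sse_by_k[k] + sse_by_k[k + 1],
    )
```

A typical use is to compute `kmeans(points, k)[1]` for a consecutive range of k values, collect the results in a dictionary, and pass it to `elbow_k`; the first and last k in the range cannot be selected because the second difference is undefined there.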
Abstract: Over the past 30 years, the economic and social status, ownership structure, business mechanisms, and management level of Chinese enterprises have been a topic of intense public discussion and concern. Solving the employment problem is an important prerequisite for people to live and work in peace, as well as a foundation for building a harmonious society. The employment situation of private enterprises has always attracted great attention, and individual and private businesses have long occupied a position in China's employment field that cannot be ignored. With the establishment of the market economy system, individual and private enterprises have become important components of the socialist economy, making significant contributions to economic development and social progress. The rapid development of China’s economy reflects, on the one hand, the superiority of China’s socialist market economic system and, on the other hand, the role of the tertiary industry and private enterprises in promoting the national economy. Since the 1990s, China’s private enterprises have become a new growth point for local and even national economies, and are one of the important channels for providing employment and achieving social stability. This paper studies employment in private enterprises and individual businesses from a statistical perspective, extracts relevant data from the China Statistical Yearbook, processes the data with statistical methods, draws conclusions, and puts forward constructive suggestions.
Funding: National Natural Science Foundation of China (No. 41572308).
Abstract: The scientific and rational positioning of monitoring locations for surface displacement on slopes is a prerequisite for early warning and forecasting. However, there is no specific provision on how to effectively determine the number and location of monitoring points according to the actual deformation characteristics of the slope, and the layout of monitoring points still has some defects. To this end, based on the displacement data series and spatial location information of surface displacement monitoring points, and by combining displacement series correlation with spatial distance influence factors, a spatial deformation correlation calculation model of slopes based on clustering analysis was proposed to calculate the correlation between different monitoring points, on the basis of which the deformation area of the slope was divided. The redundant monitoring points in each partition were eliminated based on the partition's outcome, and the overall optimal arrangement of slope monitoring points was then achieved. This method scientifically addresses the issues of slope deformation zoning and data gathering overlap. It not only eliminates human subjectivity from slope deformation zoning but also increases the efficiency and accuracy of slope monitoring. To verify the effectiveness of the method, a sand-mudstone interbedded counter-tilt excavation slope in Chongqing, China, was used as the research object. Twenty-four monitoring points deployed on this slope were monitored for surface displacement for 13 months, and the spatial location of the monitoring points was discussed. The results show that the proposed method of slope deformation zoning and the optimized placement of monitoring points are feasible.
Funding: The National Natural Science Foundation of China under contract Nos 42106005, 91958203, 41676131, and 41876155.
Abstract: The classification of the springtime water masses has an important influence on the hydrography, regional climate change and fishery in the Taiwan Strait. Based on CTD profiling data from 58 stations collected in the western and southwestern Taiwan Strait during the spring cruise of 2019, we analyze the spatial distributions of temperature (T) and salinity (S) in the investigation area. Then, using the fuzzy cluster method combined with the T-S similarity number, we classify the investigation area into 5 water masses: the Minzhe Coastal Water (MZCW), the Taiwan Strait Mixed Water (TSMW), the South China Sea Surface Water (SCSSW), the South China Sea Subsurface Water (SCSUW) and the Kuroshio Branch Water (KBW). The MZCW appears in the near-surface layer along the western coast of the Taiwan Strait, showing low-salinity (<32.0) tongues near the Minjiang River Estuary and the Xiamen Bay mouth. The TSMW covers most of the upper layer of the investigation area. The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait, beneath which is the SCSUW. The KBW is a high-temperature (core value of 26.36 °C) and high-salinity (core value of 34.62) water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
Funding: National Natural Science Foundation of China (42174131) and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Abstract: In this research, an integrated classification method based on principal component analysis, simulated annealing genetic algorithm and fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data was used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in different categories were analyzed and compared with the lithological profiles. The analysis results of numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match well with the lithologic profile, which demonstrates the reliability of the classification method.
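The "degree of membership" mentioned above comes from the fuzzy c-means step. The sketch below shows only the textbook FCM membership update for fixed cluster centers; it does not reproduce the PCA or SAGA stages, and the function name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).

    Returns one row per point; each row holds the memberships in every
    cluster and sums to 1.
    """
    if m <= 1:
        raise ValueError("fuzzifier m must be greater than 1")
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0 for d in dists):
            # A point sitting exactly on a center belongs fully to it
            # (shared equally if several centers coincide with the point).
            hits = [d == 0 for d in dists]
            rows.append([1.0 / sum(hits) if h else 0.0 for h in hits])
        else:
            rows.append(
                [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]
            )
    return rows
```

Unlike a hard K-means assignment, a sample halfway between two centers receives comparable memberships in both, which is what allows the reservoir classes to be graded rather than forced.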
Abstract: Background: To better solve the cluster analysis problem, we propose a new method based on the chaotic particle swarm optimization (CPSO) algorithm. Methods: We first evaluate the clustering performance of this model using the variance ratio criterion (VRC) as the evaluation metric. The effectiveness of the CPSO algorithm is compared with that of the traditional particle swarm optimization (PSO) algorithm. The CPSO aims to improve the VRC value while avoiding local optimal solutions. The simulated dataset is set at three levels of overlapping: non-overlapping, partial overlapping, and severe overlapping. Finally, we compare CPSO with two other methods. Results: In the comparison, the proposed CPSO method performs best. Under non-overlapping, partial overlapping, and severe overlapping conditions, our method has the best VRC values of 1683.2, 620.5, and 275.6, respectively. The mean VRC values in these three cases are 1683.2, 617.8, and 222.6. Conclusion: The CPSO performed better than the other methods and is effective for cluster analysis problems.
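For readers unfamiliar with the metric, the variance ratio criterion (also known as the Calinski-Harabasz index) is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The following is a minimal sketch of that standard definition only; it does not implement CPSO, and the function name is chosen for illustration.

```python
def variance_ratio_criterion(points, labels):
    """VRC = [B / (k - 1)] / [W / (n - k)] for n points in k clusters."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def mean(rows):
        return tuple(sum(col) / len(rows) for col in zip(*rows))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # B: size-weighted squared distance of centroids to the grand mean
    within = 0.0   # W: squared distance of points to their own centroid
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)
    if within == 0:
        return float("inf")  # perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))
```

A higher value indicates compact, well-separated clusters, which is why an optimizer such as the one described above would try to maximize it.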
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62101275 and 62101274).
Abstract: With the rapid development of technology, processing the explosively growing volume of meteorological data on traditional standalone computers has become increasingly time-consuming and cannot meet the demands of scientific research and business. Therefore, this paper proposes an implementation of the parallel Clustering Large Application based upon RANdomized Search (CLARANS) clustering algorithm on the Spark cloud computing platform to cluster China’s climate regions using meteorological data from 1988 to 2018. The aim is to address the challenge of applying clustering algorithms to large datasets. In this paper, the morphological similarity distance is adopted as the similarity measure instead of Euclidean distance, which improves clustering accuracy. Furthermore, the issue of local optima caused by an improper selection of initial clustering centers is addressed by utilizing the max-distance criterion. Compared to the k-means clustering algorithm already implemented in the Spark platform, the proposed algorithm is more robust and reduces the interference of outliers in the dataset on clustering results; it also has higher parallel performance than the frequently used serial algorithms, thus improving the efficiency of big data analysis. The experiment compares the clustered centroid data with the annual average meteorological data of representative cities in the five typical meteorological regions of China, and the results show that the clustering results are in good agreement with the meteorological data obtained from the National Meteorological Science Data Center. This algorithm has a positive effect on the clustering analysis of massive meteorological data and deserves attention in scientific research activities.
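The abstract does not spell out the max-distance criterion, so the sketch below assumes the common farthest-first reading: after an arbitrary first center, each new center is the point whose distance to its nearest already-chosen center is largest. Euclidean distance is used here as a stand-in for the paper's morphological similarity distance, and the function name is hypothetical.

```python
import math


def max_distance_init(points, k):
    """Farthest-first choice of k initial centers (assumed interpretation)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [points[0]]  # arbitrary first center: the first data point
    while len(centers) < k:
        # Pick the point that is farthest from its closest existing center.
        farthest = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(farthest)
    return centers
```

Spreading the initial centers in this way is a common remedy for the poor local optima that arise when several starting centers fall in the same dense region.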
Abstract: Regarding the aggregation of experts' evaluation information in group decision-making, the existing methods of determining experts' weights based on cluster analysis take into account the experts' preferences and the consistency of the experts' collating vectors, but they lack a measure of information similarity. As a result, an expert whose collating vector is similar to the group consensus but whose information uncertainty is great may still be clustered into a larger group and given a high weight. To address this, a new aggregation method based on entropy and cluster analysis in the group decision-making process is provided, in which the collating vectors are classified with an information similarity coefficient, and the experts' weights are determined according to the result of classification, the entropy of the collating vectors and the consistency of the judgment matrix. Finally, a numerical example shows that the method is feasible and effective.
Abstract: Clustering is used to gain an intuition of the structures in the data. Most of the current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure in the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce a meaningful insight into the relationships in the data. This is especially useful in time-series data since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate clustering structure of time-series data.
Funding: CAS Action-plan for West Development, KZCX2-XB2-06-03; National Natural Science Foundation of China, No. 30500064.
Abstract: Daily summer precipitation data from 97 meteorological stations on the Qinghai-Tibet Plateau from 1961 to 2004 were selected to analyze the temporal-spatial distribution through accumulated variance, correlation analysis, regression analysis, empirical orthogonal function, power spectrum function and spatial analysis tools of GIS. The results showed that summer precipitation occupied a relatively high proportion in areas of the Plateau with less annual precipitation, and the correlation between summer precipitation and annual precipitation was strong. The altitude of the stations and the summer precipitation tendency presented a stronger positive correlation below 2000 m, with a correlation value up to 0.604 (α=0.01). The differences in tendency values between 1961-1983 and 1984-2004 at five altitude ranges (2000-2500 m, 2500-3000 m, 3500-4000 m, 4000-4500 m and above 4500 m) were above zero and accounted for 71.4% of the total. Using the empirical orthogonal function, summer precipitation could be roughly divided into three precipitation pattern fields: the Southeast Plateau Pattern Field, the Northeast Plateau Pattern Field and the Three Rivers' Headstream Regions Pattern Field. The former two had reversed values from north to south, with the dividing line along 35°N. The potential cycles of the three pattern fields were 5.33 a, 21.33 a and 2.17 a respectively, tested at a confidence probability of 90%. The station altitudes and summer precipitation potential cycles presented a strong negative correlation at stations above 4500 m, with a correlation value of -0.626 (α=0.01). In the Three Rivers' Headstream Regions, the summer precipitation cycle decreased as the altitude rose at stations above 3500 m and increased as the altitude rose at those below 3500 m. The empirical orthogonal function analysis of June, July and August precipitation showed that the June precipitation pattern field was similar to July's, in which the southern Plateau was positive and the northern Plateau negative, but the positive area in the July pattern field was obviously smaller than June's. The August pattern field was totally opposite to June's and July's: the positive area in the August pattern field jumped from the southern Plateau to the northern Plateau.
Abstract: The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Funding: The National Natural Science Foundation of China under contract No. 49976027; the Important Topic of Scientific Research of the State Oceanic Administration, China, on the construction system of oil fingerprinting database and the key technology (from 2004 to 2005).
Abstract: The fingerprints of six crude oils distributed in four oilfields and four oil platforms were analyzed by gas chromatography, and the corresponding normal paraffin hydrocarbon (including pristane and phytane) concentrations were obtained by the internal standard method. The normal paraffin hydrocarbon distribution patterns of the six crude oils were built and compared. Cluster analysis on the normal paraffin hydrocarbon concentration was conducted for classification, and some ratios of oils were used for oil comparison. The results indicated that there was a clear difference between crude oils from different oil fields and a small difference between crude oils from the same oil platform. The normal paraffin hydrocarbon distribution pattern and ratios, as well as the cluster analysis on the normal paraffin hydrocarbon concentration, can differentiate crude oils with small differences better than the original gas chromatogram.
Funding: Supported by National Natural Science Foundation of China (81603251); Key Research and Development Plan of Shanxi Province (201603D3113021); Project of Collaborative Innovation Center for the Comprehensive Development and Utilization of Medicinal Herbs in Shanxi Province (2017-JYXT-05).
Abstract: [Objectives] This study aimed to establish an HPLC fingerprint and conduct cluster analysis and principal component analysis for Citri Reticulatae Pericarpium Viride. [Methods] Using the HPLC method, the determination was performed on an XSelect® HSS T3-C18 column with a mobile phase of acetonitrile-0.5% acetic acid solution (gradient elution) at a flow rate of 1.0 mL/min. The detection wavelength was 360 nm. The column temperature was 25 °C. The sample size was 10 μL. With the peak of hesperidin as the reference, HPLC fingerprints of 10 batches of Citri Reticulatae Pericarpium Viride were determined. The similarity of the 10 batches of samples was evaluated by the Similarity Evaluation System for Chromatographic Fingerprint of TCM (2012 edition) to determine the common peaks. Cluster analysis and principal component analysis were performed using SPSS 17.0 statistical software. [Results] The HPLC fingerprints of the 10 batches of medicinal materials had a total of 11 common peaks, and the similarity was 0.919-1.000, indicating that the chemical composition of the 10 batches of medicinal materials was consistent. There were 11 common components in the 10 batches of medicinal materials, but their contents were different. When the Euclidean distance was 20, the 10 batches of samples were divided into two categories, with S4 in the first category and the others in the second. When the Euclidean distance was 5, the second category could be further divided into two sub-categories, with S1 and S10 in one sub-category, and S2, S3, S5, S6, S7, S8 and S9 in the other. The principal component analysis showed that the cumulative contribution rate of the two main component factors was 92.797%, and the comprehensive score of S7 was the highest, indicating the best quality. [Conclusions] The results of HPLC fingerprinting, cluster analysis and principal component analysis can provide a reference for the quality control of Citri Reticulatae Pericarpium Viride.
Funding: Supported by the National Science and Technology Support Program (Grant No. 2006BAD01A01-5); the Key Program of the Development of Variety of Genetically Modified Organisms (Grant No. 2008ZX08001-006); Special Program for Rice Scientific Research, Ministry of Agriculture, China (Grant No. nyhyzx 07-001-006); the Key Support Program of Jiangsu Science and Technology (Grant No. BE2008354); Jiangsu Self-innovation Fund for Agricultural Science and Technology, China (Grant No. CX[08]603).
Abstract: The diversity of 60 conventional japonica rice accessions with good eating quality from China and abroad was analyzed using SSR molecular markers, agronomic traits and taste characteristics. A total of 290 alleles were detected in the 60 accessions at 72 SSR loci, with high similarity coefficients varying between 0.600 and 0.924. The loci on chromosome 5 showed the greatest average allele number. Additionally, most of the SSR loci could detect 3 to 4 alleles. A UPGMA dendrogram based on the cluster analysis of the genetic similarity coefficients showed that the grouping trend of part of the rice accessions was geographically related, and most of the rice accessions from Jiangsu Province, China were clustered together. Furthermore, many domestic accessions of southern and northern origin in China were close to the foreign japonica rice varieties, as confirmed by their pedigree origin from foreign high-quality sources. For taste characteristics, part of the accessions with excellent taste were clearly clustered into one category although they came from different geographical regions, which indicates that the taste characteristics of some varieties were mainly genetically determined. In addition, the agronomic traits of japonica rice with good taste might be closely related to their geographical origins, but the relationship between superior taste characteristics and agronomic traits should be further clarified.
Abstract: Using principal component analysis (PCA) and cluster analysis, a multi-index evaluation system for fruit and vegetable nutrition is standardized, reduced in dimension and de-correlated, and principal component factors are assigned based on cluster analysis of the loading matrix combined with the actual meaning and evaluation direction of the index categories. The nutritional richness of fruits and vegetables is evaluated according to their nutrition scores, and equivalent replacement suggestions for vegetables and fruits in different seasons are then given according to the clustering results. The study shows that the principal component cluster method can not only classify multivariate data reasonably and effectively, but also evaluate the sample objects reasonably, providing a strong basis for the evaluation of fruit and vegetable nutrition.
Funding: Supported by the Doctoral Fund of Northeast Agricultural University (2009RC41) and Postdoctoral Grants of Heilongjiang Province (LBH-Z10265).
Abstract: For the first time, we used the Tullgren method to study the vertical migration and cluster analysis of the soil mesofauna in Dongying Halophytes Garden in the Yellow River Delta (YRD), Shandong Province. The results showed that the soil mesofauna tended to gather at the soil surface in most samples at most times, but the vertical migration varied greatly with season and environmental conditions. Acari was the dominant group. The diversity index of the soil fauna was correlated with the evenness index. The number of Acari individuals affected the number of other species and their abundance. The dominant group, Acari, made a greater contribution to the result of the cluster analysis, and cluster analysis with both the Bray-Curtis and Jaccard similarity coefficients showed significant differences between communities in different habitats.
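The two coefficients named above differ in what they compare: Bray-Curtis uses abundances, while Jaccard uses only presence or absence. The sketch below shows the standard definitions on per-taxon count vectors; the function names and the sample counts in the usage note are made up for illustration and are not data from the study.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity: 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / total


def jaccard_similarity(a, b):
    """Jaccard similarity on presence/absence: |A and B| / |A or B|."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty")
    return len(present_a & present_b) / len(union)
```

For two hypothetical samples with counts `[10, 5, 0, 1]` and `[6, 5, 2, 0]` over the same four taxa, the Bray-Curtis similarity is 22/29 and the Jaccard similarity is 0.5, which illustrates how a numerically dominant group can pull the abundance-based coefficient upward.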
Abstract: The genetic diversity of 41 parental lines popularized in commercial hybrid rice production in China was studied by using cluster analysis of morphological traits and simple sequence repeat (SSR) markers. The forty-one entries were assigned to two clusters (i.e., an early or medium-maturing cluster and a medium or late-maturing cluster) and further assigned to six sub-clusters based on morphological trait cluster analysis. The early or medium-maturing cluster was composed of 15 maintainer lines, four early-maturing restorer lines and two thermo-sensitive genic male sterile lines, and the medium or late-maturing cluster included 16 restorer lines and 4 medium or late-maturing maintainer lines. Moreover, the SSR cluster analysis classified the 41 entries into two groups (i.e., a maintainer line group and a restorer line group) and seven sub-groups. The maintainer line group consisted of all 19 maintainer lines and the two thermo-sensitive genic male sterile lines, while the restorer line group was composed of all 20 restorer lines. The SSR analysis fitted better with the pedigree information. From the viewpoint of hybrid rice breeding, the results suggested that SSR analysis might be a better method to study the diversity of parental lines in indica hybrid rice.
Funding: This work was supported by National Natural Science Foundation of China (Nos. U1562218, 41604107, and 41804126).
Abstract: Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, thereby leading to wrong classification in the application of this technology. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. This algorithm clusters the data based on the idea of a graph. Its core is that seismic data are regarded as points in space, and the points are connected by edges to construct graphs. When the graphs are divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the inner edges of each subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and entails large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. Through the selection of a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented to reduce the cost of the spectral clustering operation. Verification on a physical model and field data shows that the proposed approach can obtain more accurate seismic facies classification results without requiring the data to satisfy any distributional hypothesis. The computing efficiency of this new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.