A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, the location must have been previously monitored in some way for this type of information to have been recorded. Another significant limitation is the need for information regarding the location and timing of incidents. Despite the current ease of obtaining location information (GPS, drone images, etc.), the timing of the event remains challenging to ascertain for a considerable portion of landslide data. Rainfall can be monitored in multiple ways, for instance by examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h) or by calculating effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rainfall monitoring approach are defined manually and subjectively, relying on the operators' experience. This makes the process labor-intensive and time-consuming, hindering the establishment of a truly standardized and rapidly scalable methodology. In this work, we propose a Landslide Early Warning System (LEWS) based on the concept of rainfall half-life and the determination of thresholds using cluster analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one operated by Cemaden, Brazil's National Center for Monitoring and Early Warning of Natural Disasters.
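The abstract above does not state how the rainfall half-life enters the calculation. As a minimal sketch, assuming the common exponential-decay form in which each hour's rainfall loses half of its weight every `half_life_h` hours, effective rainfall could be accumulated as follows. The function name, the hourly time step and the sample values are illustrative assumptions, not the authors' implementation.

```python
def effective_rainfall(hourly_mm, half_life_h):
    """Exponentially decayed rainfall accumulation (assumed formulation).

    hourly_mm   -- rainfall totals in mm, one value per hour, oldest first
    half_life_h -- hours after which a rainfall pulse counts for half
    """
    if half_life_h <= 0:
        raise ValueError("half_life_h must be positive")
    decay = 0.5 ** (1.0 / half_life_h)  # weight retained after one hour
    total = 0.0
    series = []
    for rain in hourly_mm:
        # Older rainfall decays by one hour, then the new hour is added.
        total = total * decay + rain
        series.append(total)
    return series


# A single 10 mm pulse followed by two dry hours, with a 2 h half-life.
example = effective_rainfall([10.0, 0.0, 0.0], half_life_h=2.0)
```

With a 2 h half-life, the 10 mm pulse contributes about 5 mm two hours later; an alert rule would then compare the latest value of the series with a threshold.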
The classification of the springtime water masses has an important influence on the hydrography, regional climate change and fishery in the Taiwan Strait. Based on CTD profiling data from 58 stations collected in the western and southwestern Taiwan Strait during the spring cruise of 2019, we analyze the spatial distributions of temperature (T) and salinity (S) in the investigation area. Then, by using the fuzzy cluster method combined with the T-S similarity number, we classify the investigation area into 5 water masses: the Minzhe Coastal Water (MZCW), the Taiwan Strait Mixed Water (TSMW), the South China Sea Surface Water (SCSSW), the South China Sea Subsurface Water (SCSUW) and the Kuroshio Branch Water (KBW). The MZCW appears in the near-surface layer along the western coast of the Taiwan Strait, showing low-salinity (<32.0) tongues near the Minjiang River Estuary and the Xiamen Bay mouth. The TSMW covers most of the upper layer of the investigation area. The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait, beneath which is the SCSUW. The KBW is a high-temperature (core value of 26.36 °C) and high-salinity (core value of 34.62) water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
Background: To improve the solution of cluster analysis problems, we propose a new method based on the chaotic particle swarm optimization (CPSO) algorithm. Methods: We first evaluate the clustering performance of this model using the variance ratio criterion (VRC) as the evaluation metric. The effectiveness of the CPSO algorithm is compared with that of the traditional particle swarm optimization (PSO) algorithm. The CPSO aims to improve the VRC value while avoiding local optimal solutions. The simulated dataset is set at three levels of overlapping: non-overlapping, partial overlapping, and severe overlapping. Finally, we compare CPSO with two other methods. Results: In the comparison, the proposed CPSO method performs best. Under non-overlapping, partial overlapping, and severe overlapping conditions, our method achieves the best VRC values of 1683.2, 620.5, and 275.6, respectively. The mean VRC values in these three cases are 1683.2, 617.8, and 222.6. Conclusion: The CPSO performed better than the other methods and is effective for cluster analysis problems.
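The variance ratio criterion used as the metric above is commonly defined as the between-cluster sum of squares divided by (k - 1), over the within-cluster sum of squares divided by (n - k). The sketch below computes it for a given labelling in plain Python. It assumes that standard definition and illustrates only the metric, not the CPSO search; the function name and the toy points are illustrative.

```python
def variance_ratio_criterion(points, labels):
    """VRC = (B / (k - 1)) / (W / (n - k)) for a given cluster labelling."""
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def centroid(pts):
        return [sum(p[d] for p in pts) / len(pts) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0  # B: size-weighted spread of cluster centroids
    within = 0.0   # W: spread of points around their own centroid
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    if within == 0.0:
        raise ValueError("within-cluster variance is zero; VRC is undefined")
    return (between / (k - 1)) / (within / (n - k))


# Two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_vrc = variance_ratio_criterion(toy_points, [0, 0, 1, 1])
```

A larger value means tighter, better-separated clusters, which is why an optimizer such as PSO or CPSO can use it directly as the objective to maximize.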
The paper deals with cluster analysis and the comparison of clustering methods. Cluster analysis belongs to the multivariate statistical methods. It is defined as a general logical procedure that groups objects into clusters on the basis of similarity or dissimilarity. Cluster analysis involves computational procedures whose purpose is to reduce a set of data to several relatively homogeneous groups (clusters), under the condition of maximal similarity within clusters and, simultaneously, minimal similarity between clusters. Similarity of objects is measured by the degree of similarity (correlation coefficient and association coefficient) or by the degree of dissimilarity, that is, distance (distance coefficient). Methods of cluster analysis are classified, according to the clustering procedure, as hierarchical or non-hierarchical.
The scientific and reasonable positioning of monitoring locations for surface displacement on slopes is a prerequisite for early warning and forecasting. However, there is no specific provision on how to effectively determine the number and location of monitoring points according to the actual deformation characteristics of the slope, and the layout of monitoring points still has some defects. To this end, based on displacement data series and spatial location information of surface displacement monitoring points, and by combining displacement series correlation with spatial distance influence factors, a spatial deformation correlation calculation model of the slope based on clustering analysis was proposed to calculate the correlation between different monitoring points, on the basis of which the deformation area of the slope was divided. The redundant monitoring points in each partition were eliminated based on the partition's outcome, and the overall optimal arrangement of slope monitoring points was then achieved. This method scientifically addresses the issues of slope deformation zoning and data gathering overlap. It not only eliminates human subjectivity from slope deformation zoning but also increases the efficiency and accuracy of slope monitoring. To verify the effectiveness of the method, a sand-mudstone interbedded counter-tilt excavation slope in Chongqing, China was used as the research object. Twenty-four monitoring points deployed on this slope were monitored for surface displacement for 13 months. The spatial location of the monitoring points was discussed. The results show that the proposed method of slope deformation zoning and the optimized placement of monitoring points are feasible.
With regard to the aggregation of experts' evaluation information in group decision-making, the existing methods of determining experts' weights based on cluster analysis take into account the experts' preferences and the consistency of the experts' collating vectors, but they lack a measure of information similarity. As a result, an expert whose collating vector is similar to the group consensus but whose information uncertainty is large may still be clustered into a larger group and given a high weight. To address this, a new aggregation method based on entropy and cluster analysis in the group decision-making process is provided, in which the collating vectors are classified with an information similarity coefficient, and the experts' weights are determined according to the result of classification, the entropy of the collating vectors and the judgment matrix consistency. Finally, a numerical example shows that the method is feasible and effective.
The daily summer precipitation data of 97 meteorological stations on the Qinghai-Tibet Plateau from 1961 to 2004 were selected to analyze the temporal-spatial distribution through accumulated variance, correlation analysis, regression analysis, empirical orthogonal function, power spectrum function and spatial analysis tools of GIS. The results showed that summer precipitation accounted for a relatively high proportion in areas of the Plateau with less annual precipitation, and that the correlation between summer precipitation and annual precipitation was strong. Station altitude and summer precipitation tendency showed a stronger positive correlation below 2000 m, with a correlation value up to 0.604 (α=0.01). The differences in tendency values between 1961-1983 and 1984-2004 at five altitude ranges (2000-2500 m, 2500-3000 m, 3500-4000 m, 4000-4500 m and above 4500 m) were above zero and accounted for 71.4% of the total. Using the empirical orthogonal function, summer precipitation could be roughly divided into three precipitation pattern fields: the Southeast Plateau Pattern Field, the Northeast Plateau Pattern Field and the Three Rivers' Headstream Regions Pattern Field. The former two had values of opposite sign between north and south, with the dividing line along 35°N. The potential cycles of the three pattern fields were 5.33 a, 21.33 a and 2.17 a respectively, tested at the 90% confidence level. Station altitude and summer precipitation potential cycle showed a strong negative correlation at stations above 4500 m, with a correlation value of -0.626 (α=0.01). In the Three Rivers' Headstream Regions, the summer precipitation cycle decreased as altitude rose at stations above 3500 m and increased as altitude rose at those below 3500 m. The empirical orthogonal function analysis of June, July and August precipitation showed that the June precipitation pattern field was similar to July's, in which the southern Plateau was positive and the northern Plateau negative. However, the positive-value area in the July precipitation pattern field was obviously smaller than June's. The August pattern field was totally opposite to June's and July's: the positive area in the August pattern field jumped from the southern Plateau to the northern Plateau.
Diversity of 60 conventional japonica rice accessions with good eating quality from China and abroad was analyzed using SSR molecular markers, agronomic traits and taste characteristics. A total of 290 alleles were detected in the 60 accessions at 72 SSR loci, with high similarity coefficients varying between 0.600 and 0.924. The loci on chromosome 5 showed the greatest average allele number. Additionally, most of the SSR loci could detect 3 to 4 alleles. A UPGMA dendrogram based on the cluster analysis of the genetic similarity coefficients showed that the grouping trend of part of the rice accessions was geographically related and that most of the rice accessions from Jiangsu Province, China were clustered together. Furthermore, many domestic accessions of southern and northern origin in China were close to the foreign japonica rice varieties, as confirmed by their pedigree origin from foreign high-quality sources. For taste characteristics, part of the accessions with excellent taste were clearly clustered into one category although they came from different geographical regions, which indicates that the taste characteristics of some varieties were mainly genetically determined. In addition, the agronomic traits of japonica rice with good taste might be closely related to their geographical origins, but the relationship between superior taste characteristics and agronomic traits should be further clarified.
Fingerprints of six crude oils from four oilfields and four oil platforms were analyzed by gas chromatography, and the corresponding normal paraffin hydrocarbon (including pristane and phytane) concentrations were obtained by the internal standard method. The normal paraffin hydrocarbon distribution patterns of the six crude oils were built and compared. Cluster analysis of the normal paraffin hydrocarbon concentrations was conducted for classification, and some ratios were used to compare the oils. The results indicated a clear difference between crude oils from different oilfields and a small difference between crude oils from the same oil platform. The normal paraffin hydrocarbon distribution patterns and ratios, as well as the cluster analysis of the normal paraffin hydrocarbon concentrations, differentiate crude oils with small differences better than the original gas chromatogram.
For the first time, we used the Tullgren method to study the vertical migration and cluster analysis of the soil mesofauna in the Dongying Halophytes Garden in the Yellow River Delta (YRD), Shandong Province. The results showed that the soil mesofauna tended to gather at the soil surface in most samples at most times, but the vertical migration varied greatly with season and environmental conditions. Acari was the dominant group. The diversity index of the soil fauna was correlated with the evenness index. The number of Acari individuals affected the numbers of other species and individuals. The dominant group, Acari, made the greater contribution to the result of the cluster analysis, and cluster analysis with both the Bray-Curtis and the Jaccard similarity coefficients showed significant differences between communities in different habitats.
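The two similarity coefficients named above differ in what they use: Bray-Curtis works on abundances, while Jaccard uses only presence or absence. A minimal sketch under the usual textbook definitions follows; the function names and the sample counts are illustrative and are not data from the study.

```python
def bray_curtis_similarity(a, b):
    """2 * (sum of shared abundance) / (total abundance of both samples)."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / total


def jaccard_similarity(a, b):
    """Taxa present in both samples / taxa present in at least one sample."""
    present_a = {i for i, count in enumerate(a) if count > 0}
    present_b = {i for i, count in enumerate(b) if count > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty")
    return len(present_a & present_b) / len(union)


# Hypothetical counts of three taxa in two habitats.
habitat_1 = [10, 0, 5]
habitat_2 = [5, 5, 0]
bc = bray_curtis_similarity(habitat_1, habitat_2)
jac = jaccard_similarity(habitat_1, habitat_2)
```

Because Bray-Curtis weights taxa by abundance, a numerically dominant group such as Acari influences it far more than it influences Jaccard, which is consistent with the observation that the dominant group contributed most to the clustering.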
Aim: Cluster analysis was conducted on data from 5,169 United States (U.S.) Arizona children aged 5-59 months, with the goal of delineating patterns of caries in the primary dentition of pre-school children without a priori pattern definitions. Methodology: Cluster analyses were conducted using all data for children aged 0-4 years in aggregate: 1) for all subjects, and 2) for subjects without crowned restored teeth. Each of these two sets of analyses consisted of 8 differently specified cluster analyses as a validation procedure. Results: The caries patterns identified from the clustering analysis are: 1) smooth surfaces (other than the maxillary incisor), 2) maxillary incisor, 3) occlusal surfaces of first molars, and 4) pit and fissure surfaces of second molars. Conclusion: The cluster analysis findings were consistent with results produced by multidimensional scaling. These cross-validated patterns may represent disease conditions resulting from different risks or from the timing of various risk factor exposures. As such, the patterns may be useful case definitions for caries risk factor investigations in children under 60 months of age.
Cupressinocladus Seward is a fossil genus of conifers, and conifer fossils with reproductive organs are very rare. In general, it is difficult to understand its natural affinities with other conifers. In this paper, a new species, Cupressinocladus guyangensis P.H. Jin et B.N. Sun sp. nov., is reported based on branches with immature female cones from the Lower Cretaceous Guyang Formation of the Guyang Basin in Inner Mongolia, northern China. The foliage shoots are decussate. Leaves are decussate, imbricate, scale-like, weakly dimorphic, and bear longitudinal glands in abaxial view. Stomatal complexes are haplocheilic, monocyclic, irregularly arranged, and spread along the leaf margin. Immature female cones are subglobose with 6-8 cone scales and three subglobose ovules arranged in a row at the base of the cone scales. Moreover, we performed cluster analysis using a statistics and machine learning toolbox for 23 fossil and extant species based on 16 morphological characters. The result implies that the new species bears a close resemblance to the extant Cupressus funebris Endl. and might have its nearest systematic affinities to it.
Cluster analysis in spectroscopy presents some unique challenges due to the specific data characteristics in spectroscopy, namely high dimensionality and small sample size. To improve cluster analysis outcomes, feature selection can be used to remove redundant or irrelevant features and reduce the dimensionality. However, for cluster analysis, this must be done in an unsupervised manner without the benefit of data labels. This paper presents a novel feature selection approach for cluster analysis, utilizing clusterability metrics to remove features that least contribute to a dataset's tendency to cluster. Two versions are presented and evaluated: the Hopkins clusterability filter, which utilizes the Hopkins test for spatial randomness, and the Dip clusterability filter, which utilizes the Dip test for unimodality. These new techniques, along with a range of existing filter and wrapper feature selection techniques, were evaluated on eleven real-world spectroscopy datasets using internal and external clustering indices. Our newly proposed Hopkins clusterability filter performed the best of the six filter techniques evaluated. However, results varied greatly for different techniques depending on the specifics of the dataset and the number of features selected, with significant instability observed for most techniques at low numbers of features. The genetic algorithm wrapper technique avoided this instability, performed consistently across all datasets and gave better results on average than using all the features in the spectra.
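The Hopkins test mentioned above compares nearest-neighbour distances measured from real data points with those measured from uniformly random points in the same region. The sketch below computes one common form of the Hopkins statistic in plain Python. The bounding-box sampling window, the distance exponent equal to the number of dimensions, and the convention that values near 1 indicate clustering are assumptions of this sketch; the abstract does not specify them, and the filter itself (ranking features by their effect on this value) is not shown.

```python
import math
import random


def hopkins_statistic(points, m, rng):
    """Hopkins statistic from m sampled data points and m uniform points.

    Values near 0.5 suggest spatial randomness; values near 1 suggest a
    clustering tendency (under the convention assumed here).
    """
    if not 0 < m < len(points):
        raise ValueError("m must be between 1 and len(points) - 1")
    dim = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dim)]
    highs = [max(p[d] for p in points) for d in range(dim)]

    def nearest(query, skip_index=None):
        # Distance from query to the closest data point, optionally
        # excluding the data point the query was taken from.
        return min(
            math.dist(query, p)
            for i, p in enumerate(points)
            if i != skip_index
        )

    sampled = rng.sample(range(len(points)), m)
    w = [nearest(points[i], skip_index=i) for i in sampled]
    u = [
        nearest([rng.uniform(lo, hi) for lo, hi in zip(lows, highs)])
        for _ in range(m)
    ]
    sum_u = sum(x ** dim for x in u)
    sum_w = sum(x ** dim for x in w)
    return sum_u / (sum_u + sum_w)


# Two tight, well-separated clumps of synthetic points.
data_rng = random.Random(0)
clumps = [(data_rng.uniform(0, 0.1), data_rng.uniform(0, 0.1)) for _ in range(10)]
clumps += [(10 + data_rng.uniform(0, 0.1), data_rng.uniform(0, 0.1)) for _ in range(10)]
h = hopkins_statistic(clumps, m=5, rng=random.Random(1))
```

Passing the random generator in explicitly keeps the result reproducible, which matters when the statistic is recomputed many times to compare candidate feature subsets.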
Supervised machine learning techniques have become well established in the study of spectroscopy data. However, the unsupervised learning technique of cluster analysis has not reached the same level of maturity in chemometric analysis. This paper surveys recent studies which apply cluster analysis to NIR and IR spectroscopy data. In addition, we summarize the current practices in cluster analysis of spectroscopy and contrast these with cluster analysis literature from the machine learning and pattern recognition domain. This includes practices in data pre-processing, feature extraction, clustering distance metrics, clustering algorithms and validation techniques. Special consideration is given to the specific characteristics of IR and NIR spectroscopy data, which typically include high dimensionality and relatively low sample size. The findings highlighted a lack of quantitative analysis and evaluation in current practices for cluster analysis of IR and NIR spectroscopy data. With this in mind, we propose an analysis model or workflow with techniques specifically suited for cluster analysis of IR and NIR spectroscopy data, along with a pragmatic application strategy.
The influence of major cultural practices, including different nitrogen application rates, population densities, transplanting leaf ages of seedlings, and water regimes, on rice canopy spectral reflectance was investigated. Results showed that increased nitrogen rates, water regimes and population densities and decreased seedling ages could enhance reflectance at NIR (near infrared) bands and reduce reflectance at visible bands. The reflectance of the green, red and NIR bands and the ratio index of 810-560 nm could be used to distinguish the different types of rice by fuzzy cluster analysis.
Correlation and path coefficient analyses were conducted for 10 characteristics of 24 pure lines of flue-cured tobacco: plant height, knot distance, leaf number, central leaf length and width, ratio of length to width, stem girth, date of budding, leaf yield and ratio of prime-medium tobacco. The leaf number and the central leaf length showed a positive or a strong positive correlation with the yield per plant, and the leaf number and leaf yield per plant showed a strong positive correlation with the ratio of prime-medium tobacco. The results showed that, among these characteristics, the leaf yield per plant played a major role in determining the ratio of prime-medium tobacco, while the others were less related to the ratio. Cluster analysis by the sum of squared deviations method showed that the 24 pure lines of flue-cured tobacco were clustered into two groups. Of the pure lines, Line T1706 and Line T1245 were distantly related to all other lines and showed heterosis when crossed with them. Lines Guangdonghuang 1 and R72(3)B-2-1 were closely related.
Objective: To explore the general pattern of Professor Gao Ying's syndrome differentiation and treatment of insomnia through drug clustering and group correspondence analysis, and to provide a reference for clinical diagnosis and treatment. Methods: Retrospective case data were collected from the outpatient system, SPSS 20.0 software was used to perform frequency and cluster analysis on high-frequency symptoms and drug data, and correspondence analysis was performed on the clustered drug-symptom groups. Results: A total of 349 consultations in 204 patients were included. Cluster analysis of 35 symptoms and 40 herbs with a frequency of more than 10% yielded a correspondence between 7 symptom groups, 6 drug groups and 5 drug-symptom groups. The drug groups and symptom groups matched closely. The physician differentiates and treats insomnia with calming the mind, clearing heat, nourishing yin, and regulating the liver, spleen, qi and phlegm as the core treatment; Yiguan Jian, Erzhi Wan, Baihe Dihuang Tang, Baihe Zhimu Tang, Wendan Tang, Sini San, Xiao Chai Hu Tang, Xiaoyao San, etc. are commonly used prescriptions; the physician's empirical formula, modified Danshen Zaoren Yin, is widely applicable to various insomnia syndromes. Conclusion: Cluster analysis of drugs and symptoms combined with group correspondence analysis can reveal the pathogenesis, treatment and class information hidden in drug-symptom data, and can reflect the general pattern of a physician's syndrome differentiation and treatment of insomnia. This method is of reference value for exploring TCM clinical experience, and the results of this study can provide feedback to guide the clinical diagnosis and treatment of insomnia.
In this study, 32 Luffa germplasm resources were collected from various regions in Zhejiang Province as experimental materials to investigate 22 agronomic traits, including fruit bearing habit, leaf margin, fruit ribbing and the percentage of nodes with female flowers among total nodes. Based on the experimental data obtained, principal component analysis and cluster analysis were carried out using DPS software. The results showed that the 22 agronomic traits could be integrated into 5 principal components, with a cumulative contribution of 81.308%. According to the correlations between the first five principal components and the traits, 14 traits with great influence were screened. On the basis of the principal component analysis, cluster analysis of the 32 Luffa germplasm resources was conducted, which divided Luffa cylindrica and Luffa acutangula into two categories and six subcategories by Euclidean genetic distance. This study provides a scientific basis for the collection, preservation, identification, creation and utilization of Luffa germplasm and for parent selection in the cross breeding of Luffa.
[Objective] The aim was to explore the classification basis of Chinese local chickens, in order to provide a theoretical basis for giving full play to genetic potential and heterosis, and new ideas and methods for the breeding, identification and evaluation of new varieties and strains. [Method] Multivariate statistical analysis of the laying performance and production-area ecology of 11 kinds of chicken was carried out by principal component analysis and cluster analysis. [Result] The cluster analysis of 4 egg-laying performance indices indicated that the 11 kinds of chicken can be roughly classified into large chickens and small chickens. The multivariate statistical analysis of 10 indices indicated that the 11 kinds of chicken can be classified into a high-altitude type and a low-altitude type when the first 3 eigenvalues were selected as 3 principal components by principal component analysis (94.99% of the total information), the similarity coefficient was computed from the first three principal component values of each kind, and cluster analysis was then done with the nearest neighbor method. [Conclusion] Ecological factors were also an important aspect of classification.
Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare these methods. We offer the correct syntax to deactivate the similarity algorithm for clustering analysis within the hierarchical clustering module of SPSS. Findings: When one inputs co-occurrence matrices into the data editor of the SPSS hierarchical clustering module without deactivating the embedded similarity algorithm, the program calculates similarity twice, and thus distorts and overestimates the degree of similarity. Practical implications: We offer the correct syntax to block the similarity algorithm for clustering analysis in the SPSS hierarchical clustering module in the case of co-occurrence matrices. This syntax enables researchers to avoid obtaining incorrect results. Originality/value: This paper presents a method of editing syntax to prevent the default use of a similarity algorithm for SPSS's hierarchical clustering module. This will help researchers, especially those from China, to properly implement the co-occurrence matrix when using SPSS for hierarchical cluster analysis, in order to provide more scientific and rational results.
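The SPSS syntax offered by that paper is not reproduced in the abstract, so it is not shown here. The underlying point, that a co-occurrence matrix should be turned into a proximity measure exactly once, can nevertheless be illustrated in Python. The sketch below assumes a symmetric co-occurrence matrix whose diagonal holds each term's total occurrence count, and uses the Ochiai (cosine) coefficient as one possible normalization; both are assumptions of this example rather than details taken from the paper.

```python
import math


def ochiai_dissimilarity(cooc):
    """Convert co-occurrence counts into dissimilarities in a single step.

    cooc[i][j] -- number of documents in which terms i and j co-occur
    cooc[i][i] -- total number of documents containing term i (assumed > 0)
    """
    n = len(cooc)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                similarity = cooc[i][j] / math.sqrt(cooc[i][i] * cooc[j][j])
                dist[i][j] = 1.0 - similarity
    return dist


# Hypothetical counts for two terms: 4 and 9 occurrences, 2 co-occurrences.
counts = [[4, 2], [2, 9]]
distances = ochiai_dissimilarity(counts)
```

The resulting matrix is meant to be handed to a clustering routine as precomputed distances. Feeding it in as if it were raw observations would make the routine compute a second proximity measure on top of the first, which is the double calculation the abstract warns about.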
Funding: The National Natural Science Foundation of China under contract Nos. 42106005, 91958203, 41676131, 41876155.
文摘The classification of the springtime water mass has an important influence on the hydrography,regional climate change and fishery in the Taiwan Strait.Based on 58 stations of CTD profiling data collected in the western and southwestern Taiwan Strait during the spring cruise of 2019,we analyze the spatial distributions of temperature(T)and salinity(S)in the investigation area.Then by using the fuzzy cluster method combined with the T-S similarity number,we classify the investigation area into 5 water masses:the Minzhe Coastal Water(MZCW),the Taiwan Strait Mixed Water(TSMW),the South China Sea Surface Water(SCSSW),the South China Sea Subsurface Water(SCSUW)and the Kuroshio Branch Water(KBW).The MZCW appears in the near surface layer along the western coast of Taiwan Strait,showing low-salinity(<32.0)tongues near the Minjiang River Estuary and the Xiamen Bay mouth.The TSMW covers most upper layer of the investigation area.The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait,beneath which is the SCSUW.The KBW is a high temperature(core value of 26.36℃)and high salinity(core value of 34.62)water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
Abstract: Background: To better solve cluster analysis problems, we propose a new method based on the chaotic particle swarm optimization (CPSO) algorithm. Methods: To enhance clustering performance, we propose a novel method based on CPSO. We first evaluate the clustering performance of this model using the variance ratio criterion (VRC) as the evaluation metric. The effectiveness of the CPSO algorithm is compared with that of the traditional particle swarm optimization (PSO) algorithm. The CPSO aims to improve the VRC value while avoiding local optimal solutions. The simulated dataset is set at three levels of overlapping: non-overlapping, partial overlapping, and severe overlapping. Finally, we compare CPSO with two other methods. Results: In the comparative results, our proposed CPSO method performs best. In the conditions of non-overlapping, partial overlapping, and severe overlapping, our method has the best VRC values of 1683.2, 620.5, and 275.6, respectively. The mean VRC values in these three cases are 1683.2, 617.8, and 222.6. Conclusion: The CPSO performed better than the other methods and is effective for cluster analysis problems.
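For readers unfamiliar with the variance ratio criterion used as the evaluation metric above, here is a minimal sketch of its standard (Calinski-Harabasz) definition: the between-cluster sum of squares divided by k - 1, over the within-cluster sum of squares divided by n - k. The function name and the toy data are illustrative, and this is not the paper's CPSO implementation; it only scores a given partition.

```python
def variance_ratio_criterion(points, labels):
    """Calinski-Harabasz variance ratio criterion for a given partition.

    VRC = (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k))
    """
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[d] for p in members) / size for d in range(dim)]
        # Between-cluster term: cluster size times squared centroid offset.
        between += size * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        # Within-cluster term: squared distances of members to their centroid.
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    if within == 0.0:
        return float("inf")  # perfectly compact clusters
    return (between / (k - 1)) / (within / (n - k))


# Toy example: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
print(variance_ratio_criterion(toy_points, toy_labels))
```

Higher values indicate compact, well-separated clusters, which is why an optimizer such as PSO or CPSO can use the VRC as the objective to maximize.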
Abstract: The paper deals with cluster analysis and a comparison of clustering methods. Cluster analysis belongs to the multivariate statistical methods. It is defined as a general logical technique or procedure that allows objects to be clustered into groups (clusters) on the basis of similarity or dissimilarity. Cluster analysis involves computational procedures whose purpose is to reduce a set of data into several relatively homogeneous groups (clusters), under the condition that similarity within clusters is maximal and, simultaneously, similarity between clusters is minimal. Similarity of objects is studied by the degree of similarity (correlation coefficient and association coefficient) or the degree of dissimilarity, that is, the degree of distance (distance coefficient). Methods of cluster analysis are classified, on the basis of the clustering procedure, as hierarchical or non-hierarchical.
Funding: National Natural Science Foundation of China (No. 41572308).
Abstract: The scientific and rational positioning of monitoring locations for surface displacement on slopes is a prerequisite for early warning and forecasting. However, there is no specific provision on how to effectively determine the number and location of monitoring points according to the actual deformation characteristics of the slope, and the layout of monitoring points still has some defects. To this end, based on the displacement data series and spatial location information of surface displacement monitoring points, and by combining displacement series correlation with spatial distance influence factors, a spatial deformation correlation calculation model of the slope based on clustering analysis was proposed to calculate the correlation between different monitoring points, on the basis of which the deformation area of the slope was divided. The redundant monitoring points in each partition were eliminated based on the partition outcome, and the overall optimal arrangement of slope monitoring points was then achieved. This method scientifically addresses the issues of slope deformation zoning and overlapping data collection. It not only eliminates human subjectivity from slope deformation zoning but also increases the efficiency and accuracy of slope monitoring. To verify the effectiveness of the method, a sand-mudstone interbedded counter-tilt excavation slope in Chongqing, China was used as the research object. Twenty-four monitoring points deployed on this slope were monitored for surface displacement for 13 months. The spatial location of the monitoring points was discussed. The results show that the proposed method of slope deformation zoning and the optimized placement of monitoring points are feasible.
Abstract: For the aggregation of experts' evaluation information in group decision-making, the existing methods of determining experts' weights based on cluster analysis take into account the experts' preferences and the consistency of the experts' collating vectors, but they lack a measure of information similarity. As a result, an expert whose collating vector is similar to the group consensus but whose information uncertainty is high may still be clustered into a larger group and given a high weight. To address this, a new aggregation method based on entropy and cluster analysis in the group decision-making process is provided, in which the collating vectors are classified with an information similarity coefficient, and the experts' weights are determined according to the classification result, the entropy of the collating vectors and the consistency of the judgment matrix. Finally, a numerical example shows that the method is feasible and effective.
Funding: CAS Action-plan for West Development, KZCX2-XB2-06-03; National Natural Science Foundation of China, No. 30500064
Abstract: Daily summer precipitation data from 97 meteorological stations on the Qinghai-Tibet Plateau from 1961 to 2004 were selected to analyze the temporal-spatial distribution through accumulated variance, correlation analysis, regression analysis, empirical orthogonal function, power spectrum function and the spatial analysis tools of GIS. The results showed that summer precipitation accounted for a relatively high proportion in the areas of the Plateau with less annual precipitation, and that the correlation between summer precipitation and annual precipitation was strong. Station altitude and summer precipitation tendency showed a stronger positive correlation below 2000 m, with a correlation value of up to 0.604 (α=0.01). The differences in tendency values between 1961-1983 and 1984-2004 in five altitude ranges (2000-2500 m, 2500-3000 m, 3500-4000 m, 4000-4500 m and above 4500 m) were above zero and accounted for 71.4% of the total. Using the empirical orthogonal function, summer precipitation could be roughly divided into three precipitation pattern fields: the Southeast Plateau Pattern Field, the Northeast Plateau Pattern Field and the Three Rivers' Headstream Regions Pattern Field. The first two had opposite values between the north and the south, with the dividing line along 35°N. The potential cycles of the three pattern fields were 5.33a, 21.33a and 2.17a respectively, tested at a confidence probability of 90%. Station altitudes and summer precipitation potential cycles showed a strong negative correlation at the stations above 4500 m, with a correlation value of -0.626 (α=0.01). In the Three Rivers' Headstream Regions, the summer precipitation cycle decreased as altitude rose at the stations above 3500 m and increased as altitude rose at those below 3500 m. The empirical orthogonal function analysis of June, July and August precipitation showed that the June precipitation pattern field was similar to July's, in which the southern Plateau was positive and the northern Plateau negative. However, the positive-value area in the July precipitation pattern field was obviously smaller than June's. The August pattern field was totally opposite to June's and July's: the positive area in the August pattern field jumped from the southern Plateau to the northern Plateau.
Funding: Supported by the National Science and Technology Support Program (Grant No. 2006BAD01A01-5); the Key Program of the Development of Variety of Genetically Modified Organisms (Grant No. 2008ZX08001-006); the Special Program for Rice Scientific Research, Ministry of Agriculture, China (Grant No. nyhyzx 07-001-006); the Key Support Program of Jiangsu Science and Technology (Grant No. BE2008354); and the Jiangsu Self-innovation Fund for Agricultural Science and Technology, China (Grant No. CX[08]603).
Abstract: The diversity of 60 conventional japonica rice accessions with good eating quality from China and abroad was analyzed using SSR molecular markers, agronomic traits and taste characteristics. A total of 290 alleles were detected in the 60 accessions at 72 SSR loci, with high similarity coefficients varying between 0.600 and 0.924. The loci on chromosome 5 showed the greatest average allele number. Additionally, most of the SSR loci could detect 3 to 4 alleles. A UPGMA dendrogram based on the cluster analysis of the genetic similarity coefficients showed that the grouping trend of part of the rice accessions was geographically related, and most of the rice accessions from Jiangsu Province, China were clustered together. Furthermore, many domestic accessions of southern and northern origin in China were close to the foreign japonica rice varieties, as confirmed by their pedigree origin from the foreign high-quality sources. For taste characteristics, part of the accessions with excellent taste were clearly clustered into one category although they came from different geographical regions, which indicates that the taste characteristics of some varieties were mainly genetically determined. In addition, the agronomic traits of japonica rice with good taste might be closely related to their geographical origins, but the relationship between superior taste characteristics and agronomic traits should be further clarified.
Funding: The National Natural Science Foundation of China under contract No. 49976027; the Important Topic of Scientific Research of the State Oceanic Administration, China, on the construction system of oil fingerprinting database and the key technology (from 2004 to 2005).
Abstract: The fingerprints of six crude oils distributed in four oilfields and four oil platforms were analyzed by gas chromatography, and the corresponding normal paraffin hydrocarbon (including pristane and phytane) concentrations were obtained by the internal standard method. The normal paraffin hydrocarbon distribution patterns of the six crude oils were built and compared. Cluster analysis of the normal paraffin hydrocarbon concentrations was conducted for classification, and some ratios were used for comparing the oils. The results indicated that there was a clear difference between crude oils from different oil fields and a small difference between crude oils from the same oil platform. The normal paraffin hydrocarbon distribution pattern and ratios, as well as the cluster analysis of the normal paraffin hydrocarbon concentrations, can differentiate crude oils with small differences better than the original gas chromatogram.
Funding: Supported by the Doctoral Fund of Northeast Agricultural University (2009RC41) and Postdoctoral Grants of Heilongjiang Province (LBH-Z10265)
Abstract: For the first time, we used the Tullgren method to study the vertical migration and cluster analysis of the soil mesofauna in Dongying Halophytes Garden in the Yellow River Delta (YRD), Shandong Province. The results showed that the soil mesofauna tended to gather on the soil surface in most samples at most times, but the vertical migration varied greatly in different seasons or environmental conditions. Acari was the dominant group. The diversity index of the soil fauna was correlated with the evenness index. The number of Acari individuals affected the numbers of other groups and individuals. The dominant group, Acari, made a greater contribution to the result of the cluster analysis, and cluster analysis with both the Bray-Curtis and Jaccard similarity coefficients showed significant differences between communities in different habitats.
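The two similarity coefficients named above can be sketched as follows, using their standard definitions: Bray-Curtis similarity works on abundances, while Jaccard similarity uses only presence or absence. The function names and the sample counts are hypothetical and are not data from the study.

```python
def bray_curtis_similarity(counts_a, counts_b):
    """Bray-Curtis similarity: 1 - sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(a + b for a, b in zip(counts_a, counts_b))
    if total == 0:
        raise ValueError("both samples are empty")
    diff = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    return 1.0 - diff / total


def jaccard_similarity(counts_a, counts_b):
    """Jaccard similarity on presence/absence: shared taxa / all taxa seen."""
    present_a = {i for i, a in enumerate(counts_a) if a > 0}
    present_b = {i for i, b in enumerate(counts_b) if b > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty")
    return len(present_a & present_b) / len(union)


# Hypothetical abundance counts for three taxa at two sites.
site_a = [10, 0, 5]
site_b = [5, 5, 0]
print(bray_curtis_similarity(site_a, site_b))
print(jaccard_similarity(site_a, site_b))
```

Because Bray-Curtis is weighted by abundance, a numerically dominant group such as Acari can drive the resulting clusters, which is consistent with the observation in the abstract; Jaccard is insensitive to abundance and so offers a useful cross-check.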
Funding: Support for this work was through NIH NIDCR NRSA #T32-DE07255
Abstract: Aim: Cluster analysis was conducted on data from 5,169 United States (U.S.) Arizona children, aged 5-59 months, with the goal of delineating patterns of caries in the primary dentition of pre-school children without a priori pattern definitions. Methodology: Cluster analyses were conducted using all data for children aged 0-4 years in aggregate: 1) for all subjects, and 2) for subjects without crowned restored teeth. Each of these two sets of analyses consisted of 8 differently specified cluster analyses as a validation procedure. Results: The caries patterns identified from the clustering analysis are: 1) smooth surfaces (other than the maxillary incisor), 2) maxillary incisor, 3) occlusal surfaces of first molars, and 4) pit and fissure surfaces of second molars. Conclusion: The cluster analysis findings were consistent with results produced by multidimensional scaling. These cross-validated patterns may represent disease conditions resulting from different risks or from the timing of various risk factor exposures. As such, the patterns may be useful case definitions for caries risk factor investigations in children under 60 months of age.
Funding: Financially supported by the National Basic Research Program of China (973 Program) (No. 2012CB822003); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20120211110022); the National Natural Science Foundation of China (No. 41402007); the Fundamental Research Funds for the Central Universities (No. lzujbky2016-201); and the US Louisiana Board of Regents under grant LEQSF(2017-20)-RD-A-29
Abstract: Cupressinocladus Seward is a fossil genus of conifers, and conifer fossils with reproductive organs are very rare. In general, it is difficult to understand its natural affinities with other conifers. In this paper, a new species, Cupressinocladus guyangensis P.H. Jin et B.N. Sun sp. nov., is reported based on branches with immature female cones from the Lower Cretaceous Guyang Formation of the Guyang Basin in Inner Mongolia, northern China. The foliage shoots are decussate. Leaves are decussate, imbricate, scale-like, weakly dimorphic, and bear longitudinal glands in abaxial view. Stomatal complexes are haplocheilic, monocyclic, irregularly arranged, and spread along the leaf margin. Immature female cones are subglobose with 6-8 cone scales, and three subglobose ovules are arranged in a row at the base of the cone scales. Moreover, we performed cluster analysis using a statistics and machine learning toolbox for 23 fossil and extant species based on 16 morphological characters. The result implies that the new species bears a close resemblance to the extant Cupressus funebris Endl. and might have its nearest systematic affinities with it.
Abstract: Cluster analysis in spectroscopy presents some unique challenges due to the specific data characteristics in spectroscopy, namely, high dimensionality and small sample size. In order to improve cluster analysis outcomes, feature selection can be used to remove redundant or irrelevant features and reduce the dimensionality. However, for cluster analysis, this must be done in an unsupervised manner without the benefit of data labels. This paper presents a novel feature selection approach for cluster analysis, utilizing clusterability metrics to remove features that least contribute to a dataset's tendency to cluster. Two versions are presented and evaluated: the Hopkins clusterability filter, which utilizes the Hopkins test for spatial randomness, and the Dip clusterability filter, which utilizes the Dip test for unimodality. These new techniques, along with a range of existing filter and wrapper feature selection techniques, were evaluated on eleven real-world spectroscopy datasets using internal and external clustering indices. Our newly proposed Hopkins clusterability filter performed the best of the six filter techniques evaluated. However, it was observed that results varied greatly for different techniques depending on the specifics of the dataset and the number of features selected, with significant instability observed for most techniques at low numbers of features. It was identified that the genetic algorithm wrapper technique avoided this instability, performed consistently across all datasets and gave better results on average than utilizing all the features in the spectra.
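As background for the Hopkins test mentioned above, the following is a minimal sketch of one common form of the Hopkins statistic, not the paper's filter. Assumptions: Euclidean nearest-neighbour distances without a power exponent, uniform probes drawn from the data's bounding box, and the convention that values near 1 suggest clustered data while values near 0.5 suggest spatial randomness. The function name, the sample size `m`, and the toy data are illustrative.

```python
import math
import random


def hopkins_statistic(points, m, seed=0):
    """One common form of the Hopkins statistic, H = U / (U + W).

    U sums nearest-data distances of m uniform random probes drawn from
    the bounding box; W sums nearest-neighbour distances of m sampled
    data points. Conventions differ between authors (assumption here:
    H near 1 means clustered, H near 0.5 means spatially random).
    """
    n = len(points)
    dim = len(points[0])
    if not 0 < m < n:
        raise ValueError("m must satisfy 0 < m < n")
    rng = random.Random(seed)
    lows = [min(p[d] for p in points) for d in range(dim)]
    highs = [max(p[d] for p in points) for d in range(dim)]

    u_sum = 0.0
    for _ in range(m):
        probe = [rng.uniform(lows[d], highs[d]) for d in range(dim)]
        u_sum += min(math.dist(probe, p) for p in points)

    w_sum = 0.0
    for i in rng.sample(range(n), m):
        w_sum += min(
            math.dist(points[i], points[j]) for j in range(n) if j != i
        )

    if u_sum + w_sum == 0.0:
        raise ValueError("all points coincide; statistic is undefined")
    return u_sum / (u_sum + w_sum)


# Toy data: two tight groups far apart, which should score well above 0.5.
two_blobs = [(i * 0.01, i * 0.01) for i in range(10)]
two_blobs += [(10 + i * 0.01, 10 + i * 0.01) for i in range(10)]
print(hopkins_statistic(two_blobs, m=5))
```

A filter in the spirit of the abstract could recompute such a statistic with each feature left out and discard the features whose removal reduces clusterability least, although the paper's exact procedure is not given here.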
Funding: This research is supported by the Commonwealth of Australia as represented by the Defence Science and Technology Group of the Department of Defence, and by an Australian Government Research Training Program (RTP) Scholarship.
Abstract: Supervised machine learning techniques have become well established in the study of spectroscopy data. However, the unsupervised learning technique of cluster analysis has not reached the same level of maturity in chemometric analysis. This paper surveys recent studies which apply cluster analysis to NIR and IR spectroscopy data. In addition, we summarize the current practices in cluster analysis of spectroscopy and contrast these with the cluster analysis literature from the machine learning and pattern recognition domain. This includes practices in data pre-processing, feature extraction, clustering distance metrics, clustering algorithms and validation techniques. Special consideration is given to the specific characteristics of IR and NIR spectroscopy data, which typically include high dimensionality and relatively low sample size. The findings highlight a lack of quantitative analysis and evaluation in current practices for cluster analysis of IR and NIR spectroscopy data. With this in mind, we propose an analysis model or workflow with techniques specifically suited for cluster analysis of IR and NIR spectroscopy data, along with a pragmatic application strategy.
Abstract: The influence of major cultural practices, including different nitrogen application rates, population densities, transplanting leaf ages of seedlings, and water regimes, on rice canopy spectral reflectance was investigated. Results showed that increased nitrogen rates, water regimes and population densities and decreased seedling ages could enhance reflectance at NIR (near infrared) bands and reduce reflectance at visible bands. Using the reflectance of the green, red and NIR bands and the ratio index of 810-560 nm, fuzzy cluster analysis could distinguish the different types of rice,
Funding: Supported by Platform Construction for Germplasm Resources of China Tobacco (2007, 152)
Abstract: Correlation and path coefficient analyses were conducted for 10 characteristics of 24 pure lines of flue-cured tobacco: plant height, knot distance, leaf number, central leaf length and width, ratio of length to width, stem girth, date of budding, leaf yield and ratio of prime-medium tobacco. The leaf number and the central leaf length showed a positive or strong positive correlation with the yield per plant, and the leaf number and leaf yield per plant showed a strong positive correlation with the ratio of prime-medium tobacco. The results showed that, among these characteristics, the leaf yield per plant played a major role in determining the ratio of prime-medium tobacco, while the others were less related to the ratio. Cluster analysis by the sum of squared deviations method showed that the 24 pure lines of flue-cured tobacco were clustered into two groups. Of the pure lines, Line T1706 and Line T1245 were distantly related to all other lines and showed heterosis when crossed with them. Lines Guangdonghuang 1 and R72(3)B-2-1 were closely related.
Funding: Traditional Chinese Medicine Inheritance and Innovation "Hundreds of Millions" Talent Project (QiHuang Project) - Qihuang Scholars (National Education and Development of Traditional Chinese Medicine [2018] No. 12).
Abstract: Objective: To explore the general rules of Professor Gao Ying's syndrome differentiation and treatment of insomnia through cluster analysis of drugs and symptoms and group correspondence analysis, and to provide a reference for clinical diagnosis and treatment. Methods: Retrospective case data were collected from the outpatient system; SPSS 20.0 software was used to perform frequency and cluster analysis on high-frequency symptom and drug data, and correspondence analysis was performed on the clustered drug and symptom groups. Results: A total of 349 consultations in 204 patients were included. Cluster analysis of 35 symptoms and 40 herbs with a frequency of more than 10% yielded correspondences among 7 symptom groups, 6 drug groups and 5 drug-symptom groups. The drug and symptom groups matched to a high degree. The physician differentiates and treats insomnia with calming the mind, clearing heat, nourishing yin, and regulating the liver, spleen, qi and phlegm as the core treatment; Yiguan Decoction, Erzhi Pill, Baihe Dihuang Decoction, Baihe Zhimu Decoction, Wendan Decoction, Sini San, Xiao Chai Hu Tang and Xiaoyao San are commonly used prescriptions. The physician's empirical formula, modified Danshen and Zaoren drink, is widely applicable to various insomnia syndromes. Conclusion: Cluster analysis of drugs and symptoms combined with group correspondence analysis can reveal the pathogenesis, treatment and classification information hidden in drug and symptom data, and can reflect the general rules of a physician's syndrome differentiation and treatment of insomnia. This method is a useful reference for exploring TCM clinical experience, and the results of this study can provide feedback to guide the clinical diagnosis and treatment of insomnia.
Funding: Supported by the "San Nong Liu Fang" Science and Technology Cooperation Project of Zhejiang Province (ZNJF[2011] No. 85) and the Major Project of Science and Technology of Zhejiang Province (2009C2006-1-8)
Abstract: In this study, 32 Luffa germplasm resources were collected from various regions in Zhejiang Province as experimental materials to investigate 22 agronomic traits, including fruit bearing habit, leaf margin, fruit ribbing and the percentage of nodes with female flowers among total nodes. Based on the experimental data, principal component analysis and cluster analysis were carried out using DPS software. The results showed that the 22 agronomic traits could be integrated into 5 principal components, with a cumulative contribution of 81.308%. According to the correlations between the first five principal components and the traits, 14 traits with great influence were screened. On the basis of the principal component analysis, cluster analysis of the 32 Luffa germplasm resources was conducted, which divided Luffa cylindrica and Luffa acutangula into two categories and six subcategories by Euclidean genetic distance. This study provides a scientific basis for the collection, preservation, identification, creation and utilization of Luffa germplasm and for parent selection in the cross breeding of Luffa.
Funding: Funded by the national "863" program (2011AA100305); the Zhejiang Province Science and Technology Platform Project (2011E60003); the Yangzhou City Science and Technology Plan (YZ2011069); and the Natural Science Foundation of Jiangsu Province (BK2011431)
Abstract: [Objective] The aim was to explore the classification basis of Chinese local chickens, in order to provide a theoretical basis for giving full play to genetic potential and heterosis, and new ideas and methods for the breeding, identification and evaluation of new varieties and strains. [Method] Multivariate statistical analysis of the laying performance and production-area ecology of 11 chicken breeds was done by principal component analysis and cluster analysis. [Result] The cluster analysis of 4 egg-laying performance indexes indicated that the 11 breeds can be roughly classified into large chickens and small chickens. The multivariate statistical analysis of 10 indexes indicated that the 11 breeds can be classified into a high-altitude type and a low-altitude type when the first 3 eigenvalues were selected as 3 principal components by principal component analysis (94.99% of the total amount of information), the similarity coefficient was computed from the first three principal component values of each breed, and cluster analysis was then done by the nearest neighbor method. [Conclusion] Ecological factors are also an important aspect of classification.
Abstract: Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare these methods. We offer the correct syntax to deactivate the similarity algorithm for clustering analysis within the hierarchical clustering module of SPSS. Findings: When one inputs co-occurrence matrices into the data editor of the SPSS hierarchical clustering module without deactivating the embedded similarity algorithm, the program calculates similarity twice, and thus distorts and overestimates the degree of similarity. Practical implications: We offer the correct syntax to block the similarity algorithm for clustering analysis in the SPSS hierarchical clustering module in the case of co-occurrence matrices. This syntax enables researchers to avoid obtaining incorrect results. Originality/value: This paper presents a method of editing syntax to prevent the default use of a similarity algorithm in SPSS's hierarchical clustering module. This will help researchers, especially those from China, to properly implement the co-occurrence matrix when using SPSS for hierarchical cluster analysis, in order to provide more scientific and rational results.