In this research, an integrated classification method based on principal component analysis, a simulated annealing genetic algorithm and fuzzy c-means clustering (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology, poro-permeability quality, engineering quality, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of each reservoir category were analyzed and compared with the lithological profiles. The results for numerical simulation and actual logging data show that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the integrated reservoir classification results match the lithologic profile well, which demonstrates the reliability of the classification method.
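The clustering stage in the abstract above rests on fuzzy c-means memberships. The following is a minimal sketch of plain FCM only: the simulated annealing genetic algorithm (SAGA) stage and the PCA step are not implemented, and the function name, the deterministic initialisation from evenly spaced input points, and the default parameters are illustrative assumptions rather than the authors' method.

```python
import math


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy c-means on a list of equal-length tuples.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which point i belongs to cluster k, taken from the last update.
    """
    n, dim = len(points), len(points[0])
    # Assumed start: evenly spaced input points serve as initial centres.
    centers = [tuple(points[(k * n) // n_clusters]) for k in range(n_clusters)]
    u = [[0.0] * n_clusters for _ in range(n)]
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        for i, p in enumerate(points):
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            u[i] = [
                1.0 / sum((d[k] / d[j]) ** power for j in range(n_clusters))
                for k in range(n_clusters)
            ]
        # Centre update: mean of the points weighted by u_ik ** m.
        new_centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[i] * points[i][a] for i in range(n)) / total
                for a in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Hypothetical two-dimensional scores, e.g. two retained principal components.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fuzzy_c_means(samples, n_clusters=2)
```

Each row of `memberships` sums to 1, so a sample can be assigned softly to several classes. This soft assignment is what the abstract refers to as clustering according to the degree of membership.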
[Objectives] This study aimed to establish an HPLC fingerprint and to conduct cluster analysis and principal component analysis for Citri Reticulatae Pericarpium Viride. [Methods] The HPLC determination was performed on an XSelect® HSS T3 C18 column with a mobile phase of acetonitrile and 0.5% acetic acid solution (gradient elution) at a flow rate of 1.0 mL/min. The detection wavelength was 360 nm, the column temperature was 25℃, and the injection volume was 10 μL. With the hesperidin peak as the reference, HPLC fingerprints of 10 batches of Citri Reticulatae Pericarpium Viride were determined. The similarity of the 10 batches was evaluated with the Similarity Evaluation System for Chromatographic Fingerprint of TCM (2012 edition) to determine the common peaks. Cluster analysis and principal component analysis were performed using SPSS 17.0 statistical software. [Results] The HPLC fingerprints of the 10 batches of medicinal materials had a total of 11 common peaks, and the similarity was 0.919-1.000, indicating that the chemical composition of the 10 batches was consistent, although the contents of the 11 common components differed. At a Euclidean distance of 20, the 10 batches were divided into two categories, with S4 in the first category and the others in the second. At a Euclidean distance of 5, the second category could be further divided into two sub-categories, with S1 and S10 in one and S2, S3, S5, S6, S7, S8 and S9 in the other. The principal component analysis showed that the cumulative contribution rate of the two main components was 92.797%, and S7 had the highest comprehensive score and the best quality. [Conclusions] The results of HPLC fingerprinting, cluster analysis and principal component analysis can provide a reference for the quality control of Citri Reticulatae Pericarpium Viride.
[Objective] This study aimed to investigate the trace elements in Rehmannia glutinosa Libosch. by using principal component analysis and clustering analysis. [Method] Principal component analysis and clustering analysis of R. glutinosa medicinal materials from different sources were conducted with the contents of six trace elements as indices. [Result] The principal component analysis comprehensively evaluated the quality of the R. glutinosa samples, with objective results that were consistent with those of the clustering analysis. [Conclusion] Principal component analysis and clustering analysis can be used for the quality evaluation of Chinese medicinal materials with multiple indices.
Clustering analysis, which identifies unknown heterogeneous subgroups of a population (or a sample), has become increasingly popular along with machine learning techniques. Although many software packages run clustering analysis, few conduct it within a structural equation modeling framework. The package gscaLCA, implemented in the R statistical computing environment, was developed for conducting clustering analysis and has been extended to latent variable modeling. More specifically, by applying both a fuzzy clustering (FC) algorithm and generalized structured component analysis (GSCA), gscaLCA computes membership prevalence and item response probabilities as posterior probabilities, which is applicable in mixture modeling such as latent class analysis in statistics. As a hybrid between data clustering in classification and the model-based mixture modeling approach, fuzzy clusterwise GSCA, denoted gscaLCA, combines advantages from both methods: (1) soft partitioning from FC and (2) efficiency in estimating model parameters with the bootstrap method by solving a global optimization problem from GSCA. The main function, gscaLCA, works for both binary and ordered categorical variables. In addition, gscaLCA can be used for latent class regression. Visualization of latent class profiles based on the posterior probabilities is also available in the package. This paper contributes a methodological tool, gscaLCA, with which applied researchers such as social scientists and medical researchers can apply clustering analysis in their research.
This paper aims to examine in depth the quality of life of people with celiac disease, with a focus on adherence to the diet, through Principal Component Analysis and Analyse des Données. In particular, we try to understand whether these analyses are also applicable in the context of Web 2.0 research carried out with web surveys.
In the past 30 years, Chinese enterprises have been a topic of wide public discussion and concern in terms of economic and social status, ownership structure, business mechanism, and management level. Solving the problem of employment is an important prerequisite for people to live and work in peace, as well as a foundation for building a harmonious society. The employment situation of private enterprises has always attracted great attention, and individual businesses and private enterprises have always occupied an important position in China's employment field. With the establishment of the market economy system, individual and private enterprises have become important components of the socialist economy, making significant contributions to economic development and social progress. The rapid development of China's economy reflects, on the one hand, the strengths of China's socialist market economic system and, on the other hand, the role of the tertiary industry and private enterprises in promoting the national economy. Since the 1990s, China's private enterprises have become a new growth point for local and even national economies, and are one of the important ways to provide employment and achieve social stability. This paper studies employment in private enterprises and individual businesses from a statistical perspective: it extracts relevant data from the China Statistical Yearbook, processes the data with statistical methods, draws conclusions and puts forward constructive suggestions.
Bitter tea is a special kind of tea germplasm in China. The major biochemical components of 24 bitter teas, 8 Camellia sinensis var. sinensis and 8 C. sinensis var. assamica tea germplasms stored in the China National Germplasm Hangzhou Tea Repository (CNGHTR) were analyzed and evaluated. The results showed no significant differences between bitter tea and common tea in the major biochemical components affecting tea quality. According to the processing suitability index, bitter tea is suitable for manufacturing black tea. According to evolutionary indices such as the composition and content of catechins, bitter tea is similar to C. sinensis var. assamica and belongs to a relatively primitive evolutionary type. The cluster analysis grouped bitter tea with C. sinensis var. assamica, so it could be considered to belong to C. sinensis var. assamica.
Using principal component analysis (PCA) and cluster analysis, a multi-index evaluation system for fruit and vegetable nutrition is standardized, reduced in dimension and de-correlated. Principal component factors are assigned based on a cluster analysis of the loading matrix, combined with the actual meaning and evaluation direction of the index categories. The nutritional richness of each fruit and vegetable is evaluated from its nutrition score, and equivalent replacement suggestions for different seasons are then given according to the clustering results. The study shows that the principal component clustering method can not only classify multivariate data reasonably and effectively, but also evaluate the sample objects reasonably, providing a strong basis for evaluating the nutrition of fruits and vegetables.
Water quality monitoring has one of the highest priorities in surface water protection policy. A variety of approaches are used to interpret and analyze the hidden variables that determine the variance of observed water quality at different source points. A considerable proportion of these approaches are based on statistical methods, multivariate statistical techniques in particular. In the present study, multivariate techniques are used to reduce the large number of water quality variables for the Nile River upstream of the Cairo Drinking Water Plants (CDWPs) and to determine the relationships among them for easy and robust evaluation. Using principal components analysis (PCA), together with Fuzzy C-Means (FCM) and the K-means algorithm for clustering analysis, this study attempted to determine the dominant factors responsible for the variations in water quality. Furthermore, cluster analysis classified 21 sampling stations into three clusters based on similarities in water quality features. The PCA results show that 6 principal components contain the key variables and account for 75.82% of the total variance of surface water quality in the study area. The dominant water quality parameters were conductivity, iron, Biological Oxygen Demand (BOD), Total Coliform (TC), ammonia (NH3), and pH. Both FCM clustering and the K-means algorithm, based on the concentrations of the dominant parameters, identified 3 cluster groups and produced cluster centers (prototypes). According to the clustering classification, water quality deteriorated as the cluster number increased from 1 to 3.
The cluster grouping can also be used to identify the physical, chemical and biological processes that create the variations in the water quality parameters. This study showed that the dominant parameters and clustered information extracted by multivariate analysis can be used to reduce the number of sampling parameters on the Nile River in a cost-effective and efficient way, without losing much information compared with a large set of parameters. These techniques can help decision makers obtain a global view of the water quality in any surface water or other water body when analyzing large data sets, especially without a priori knowledge of the relationships among the variables.
In this study, 32 Luffa germplasm resources were collected from various regions in Zhejiang Province as experimental materials to investigate 22 agronomic traits, including fruit bearing habit, leaf margin, fruit ribbing and the percentage of nodes with female flowers. Based on the experimental data, principal component analysis and cluster analysis were carried out using DPS software. The results showed that the 22 agronomic traits could be integrated into 5 principal components, with a cumulative contribution rate of 81.308%. According to the correlations between the first five principal components and the traits, 14 highly influential traits were screened. On the basis of the principal component analysis, cluster analysis of the 32 Luffa germplasm resources divided Luffa cylindrica and Luffa acutangula into two categories and six subcategories by Euclidean genetic distance. This study provides a scientific basis for the collection, preservation, identification, creation and utilization of Luffa germplasm and for parent selection in Luffa cross breeding.
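The cumulative contribution rate reported above is the running share of total variance explained by the leading principal components. A minimal sketch of that arithmetic follows. The eigenvalues are hypothetical illustration values, not the study's data, and the 80% cut-off is an assumed convention that the abstract does not state.

```python
def cumulative_contribution(eigenvalues):
    """Running share of total variance, components sorted by eigenvalue."""
    total = sum(eigenvalues)
    running = 0.0
    shares = []
    for value in sorted(eigenvalues, reverse=True):
        running += value
        shares.append(running / total)
    return shares


def components_needed(eigenvalues, threshold=0.8):
    """Smallest number of components whose cumulative share meets threshold."""
    for count, share in enumerate(cumulative_contribution(eigenvalues), start=1):
        if share >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a correlation matrix (not the Luffa data).
example_eigenvalues = [4.0, 3.0, 2.0, 1.0]
shares = cumulative_contribution(example_eigenvalues)
kept = components_needed(example_eigenvalues, threshold=0.8)
```

With these example values the first three components are retained, because the cumulative share first reaches the threshold at the third component.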
This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, which requires simultaneous optimality of orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap. An index normalization method converts the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method is proposed that consists of index normalization, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The orbit optimization results for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of non-independent indices and thus saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
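The abstract above does not specify its normalization formula. As a sketch under the assumption that ordinary min-max scaling is used, the snippet below maps each index to [0, 1] and reverses the direction for indices where smaller is better, such as the maximum coverage gap. The function name and the sample values are hypothetical.

```python
def normalize_index(values, larger_is_better=True):
    """Min-max scale one performance index to the range [0, 1].

    For cost-type indices (smaller is better) the scale is reversed so that
    1.0 always marks the most desirable candidate.
    """
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant index carries no ranking information.
        return [0.0 for _ in values]
    span = highest - lowest
    if larger_is_better:
        return [(v - lowest) / span for v in values]
    return [(highest - v) / span for v in values]


# Hypothetical values for three candidate orbits.
total_coverage_time = normalize_index([10, 20, 30])                # benefit index
max_coverage_gap = normalize_index([2, 4, 6], larger_is_better=False)  # cost index
```

After this step, indices with different units share a common dimensionless scale and can be passed together to PCA or to a weighted evaluation.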
Gold mining is now widely acknowledged as one of the significant sources of soil pollution in developed countries. In developing countries, the sources and levels of soil contamination have not been thoroughly addressed. This study therefore aimed to determine the sources of soil pollution and the level of contamination in active and closed gold mining areas. The paper presents the pollution load of heavy metals (lead, Pb; chromium, Cr; cadmium, Cd; copper, Cu; arsenic, As; manganese, Mn; and nickel, Ni) in 90 soil samples collected from the studied sites. Multivariate statistical analysis, including Principal Component Analysis (PCA) and Cluster Analysis (CA), coupled with correlation coefficient analysis, was performed to determine the possible sources of pollution in the study areas. The results indicated that Pb, Cr, Cu and Mn come from different sources than Cd, As and Ni. The metal pollution assessment using the Pollution Index (PI) and the Geoaccumulation Index (Igeo) confirmed that soils in the mining areas ranged from moderately through strongly to highly contaminated. This study verified that soil contamination in the gold mining areas results from both natural and anthropogenic processes. The findings improve our knowledge of the soil contamination level in the mining areas and of the sources of contamination. PCA, CA, PI and Igeo are recommended for assessing and monitoring heavy-metal-contaminated soil in gold mining areas.
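The abstract above names the two indices without giving their formulas. The sketch below assumes the commonly used definitions: a single-factor pollution index PI = C / B, and Müller's geoaccumulation index Igeo = log2(C / (1.5 B)), where C is the measured concentration and B the geochemical background value. The function names and the concentrations are hypothetical and are not taken from the study.

```python
import math


def pollution_index(concentration, background):
    """Single-factor pollution index: measured value over background value."""
    return concentration / background


def geoaccumulation_index(concentration, background):
    """Geoaccumulation index, assuming Mueller's form log2(C / (1.5 * B)).

    The factor 1.5 allows for natural variation in the background value.
    """
    return math.log2(concentration / (1.5 * background))


# Hypothetical lead concentration and background, both in mg/kg.
pi_pb = pollution_index(30.0, 10.0)
igeo_pb = geoaccumulation_index(30.0, 10.0)
```

Both indices compare a measured concentration with a reference level. Igeo is on a logarithmic scale, so each unit increase corresponds to a doubling of the concentration relative to 1.5 times the background.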
Based on 10 years of statistics on economic development in Xi'an, the main part of the Guanzhong-Tianshui Economic Zone, this article builds an indicator system to reflect economic development. Using two mathematical methods (principal component analysis and cluster analysis), we carry out a comprehensive evaluation of the main economic indicators, point out the differences in the distribution of economic development levels in this region, and classify them, in order to provide a scientific basis for decision makers to formulate economic development strategies in accordance with economic development level and geographical location.
This study examined public attitudes concerning the value of the outdoor spaces that people use daily. Two successive analyses were performed based on data from ordinary residents and college students in the city of Hangzhou, China. First, citizens listed various items constituting desirable values of residential outdoor spaces in a preliminary questionnaire. The results suggested three general attributes (functional, aesthetic and ecological) and ten specific qualities of residential outdoor spaces. An analytic hierarchy process (AHP) was applied to an interview survey to clarify the weights among these attributes and qualities. Second, principal factors were extracted from the ten specific qualities with principal component analysis (PCA) for both the residential case and the campus case. In addition, the respondent groups were classified with cluster analysis (CA) using the PCA results. The AHP results showed that the public prefers the functional attribute to the aesthetic attribute, although the latter is usually viewed as the core value of open spaces by architects and designers. Furthermore, comparisons of the ten specific qualities showed that the public prefers open spaces that can be used conveniently and easily for group activities, because such spaces sustain an active lifestyle of neighborhood communication, which is also seen as protecting human-centered residential environments. Moreover, different groups of respondents diverge considerably in terms of gender, age, behavior and preference.
With 16 Yunnan tea tree varieties and 5 Kenyan tea tree varieties as test materials, the differences in biochemical components between Yunnan and Kenyan tea tree varieties were compared and analyzed. The results showed that the coefficients of variation of tea polyphenols, amino acids, caffeine, water extract, gallic acid (GA), catechin (C), epicatechin (EC), epicatechin gallate (ECG), epigallocatechin (EGC), epigallocatechin gallate (EGCG) and total catechins in Yunnan tea tree varieties were greater than those in Kenyan tea tree varieties. The contents of tea polyphenols, amino acids, caffeine, water extract, C, EC, EGC, EGCG and total catechins in Yunnan varieties did not differ significantly from those in Kenyan varieties (P>0.05), while the contents of GA and ECG differed significantly between the two groups (P<0.05). Therefore, GA and ECG might be among the main features distinguishing the biochemical components of Yunnan and Kenyan tea tree varieties. The cluster analysis showed that at a genetic distance of 15, the 21 tested tea varieties could be divided into three groups with obvious biochemical differences.
With the development of the power grid, condition assessment methods for the transformer, one of its key pieces of equipment, have always received attention from experts, and scholars are increasingly concerned with the practicality and reliability of these methods. In traditional condition assessment, because of the transformer's complex structure, problems arise: the assessment system is either not comprehensive enough or too complex, and the indexes are not easy to quantify. It is therefore necessary to propose a condition assessment method that is easy to carry out without compromising the assessment results. In this paper, we propose a method to assess the state of the transformer. First, we establish a comprehensive assessment system, then apply principal component analysis to optimize the index system, and then apply cloud matter-element theory. Finally, the reliability and rationality of the method are verified by an example.
The accurate extraction and classification of leather defects is an important guarantee for automation and quality evaluation in the leather industry. To address the problem of classifying leather defect data, a hierarchical classification of defects is proposed. First, samples are collected using the minimum rectangle method, and defects are extracted by image processing. According to their geometric features, the defects are roughly classified as dot, line and surface defects. The dominant characteristics are obtained by analyzing the geometric, gray-level and texture data extracted from the defects. For each type of defect, different representative characteristics are chosen to reduce the dimension of the data, and clustering on these characteristics converges effectively, so that the defect characteristics are extracted accurately and digitized, and a database is eventually established. The results show that this method achieves more than 90% accuracy and greatly improves the accuracy of classification.
Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA and five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE) and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indexes to assess these methods. Moreover, clustering accuracy is used as an indirect performance index to evaluate the quality of the representations obtained by these methods. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA performs excellently on both the quality assessment criteria and clustering accuracy.
Glass is precious material evidence of trade along the early Silk Road. Ancient glass was easily affected by the environment and by weathering, and the resulting change in composition ratios hindered the correct judgment of its category. In this paper, mathematical models and methods such as the Chi-square test, the weighted average method, principal component analysis, cluster analysis, a binary classification model and grey correlation analysis were used together to analyze the data of sample glass products in combination with their categories. The results showed that the weathered high-potassium glass could be divided into 12, 9, 10 and 27, 7, 22, and so on.
Funding (PCA-SAGA-FCM tight sandstone reservoir classification): Funded by the National Natural Science Foundation of China (42174131) and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Funding (HPLC fingerprint of Citri Reticulatae Pericarpium Viride): Supported by the National Natural Science Foundation of China (81603251), the Key Research and Development Plan of Shanxi Province (201603D3113021), and the Project of the Collaborative Innovation Center for the Comprehensive Development and Utilization of Medicinal Herbs in Shanxi Province (2017-JYXT-05).
Funding (trace elements in Rehmannia glutinosa): Supported by the Fund of the Sichuan Provincial Administration of Traditional Chinese Medicine (2008-12).
Funding (gscaLCA package): Supported by the Yonsei University Research Fund of 2021 (2021-22-0060).
Abstract: This paper aims to deepen the understanding of the quality of life of people with celiac disease, with a focus on compliance with the diet, through Principal Component Analysis and Analyse des Données. In particular, we try to understand whether these analyses are also applicable in the context of Web 2.0 research carried out with web surveys.
Abstract: In the past 30 years, Chinese enterprises have been a hot topic of discussion and concern among the general public in terms of economic and social status, ownership structure, business mechanism, and management level. Solving the problem of employment is an important prerequisite for people to live and work in peace, as well as a foundation for building a harmonious society. The employment situation of private enterprises has always been of great concern, and private enterprises and individual businesses have always occupied a position in China's employment field that cannot be ignored. With the establishment of the market economy system, individual and private enterprises have become important components of the socialist economy, making significant contributions to economic development and social progress. The rapid development of China's economy reflects, on the one hand, the superiority of China's socialist market economic system and, on the other hand, the role of the tertiary industry and private enterprises in promoting the national economy. Since the 1990s, China's private enterprises have become a new point of economic growth at the local and even national level, and are one of the important ways to provide employment and achieve social stability. This paper studies employment in private enterprises and individual businesses from a statistical perspective: it extracts relevant data from the China Statistical Yearbook, processes the data with statistical methods, draws conclusions, and puts forward constructive suggestions.
Funding: Supported by the "Study on High Efficiency Machining and Multiple Utilization Technology of Tea Germplasm Resource" of National Science & Technology Supporting Project (2006BAD06B01); "Data Standard of Perennial and Vegetative Propagation Crop Germplasm Resources as a Share Experimental Unit" of National Fundamental Resources Platform of Science & Technology Project (2005DKA21002-08)
Abstract: Bitter tea is a special kind of tea germplasm in China. The major biochemical components of 24 bitter teas, 8 Camellia sinensis var. sinensis and 8 C. sinensis var. assamica tea germplasms, which were stored in the China National Germplasm Hangzhou Tea Repository (CNGHTR), were analyzed and evaluated. The results showed no significant differences between bitter tea and common tea in the major biochemical components affecting tea quality. According to the processing suitability index, bitter tea was suitable for the manufacturing of black tea, while according to evolutionary indices such as the composition and content of catechin, bitter tea was similar to C. sinensis var. assamica, which belongs to the relatively primitive type in evolution. The results of cluster analysis indicated that bitter tea clustered with C. sinensis var. assamica, so it could be considered to belong to C. sinensis var. assamica.
Abstract: Using principal component analysis (PCA) and cluster analysis, a multiple-index evaluation system for fruit and vegetable nutrition is standardized, reduced in dimension and de-correlated, and principal component factors are assigned based on cluster analysis of the loading matrix, combined with the actual meaning and evaluation direction of the index categories. The nutritional richness of each fruit and vegetable is evaluated according to its nutrition score, and equivalent replacement suggestions for vegetables and fruits in different seasons are given according to the clustering results. The study shows that the principal component cluster method can not only classify multivariate data reasonably and effectively, but also evaluate the sample objects reasonably, providing a strong basis for the evaluation of fruit and vegetable nutrition.
Abstract: Water quality monitoring has one of the highest priorities in surface water protection policy. A variety of approaches are used to interpret and analyze the concealed variables that determine the variance of observed water quality at various source points. A considerable proportion of these approaches are based on statistical methods, multivariate statistical techniques in particular. In the present study, multivariate techniques are used to reduce the large number of water quality variables of the Nile River upstream of the Cairo Drinking Water Plants (CDWPs) and to determine the relationships among them for easy and robust evaluation. By means of principal components analysis (PCA), Fuzzy C-Means (FCM) and the K-means algorithm for clustering analysis, this study attempted to determine the major dominant factors responsible for the variations in Nile River water quality upstream of the CDWPs. Furthermore, cluster analysis classified 21 sampling stations into three clusters based on similarities of water quality features. The PCA results show that 6 principal components contain the key variables and account for 75.82% of the total variance of surface water quality in the study area; the dominant water quality parameters were conductivity, iron, Biological Oxygen Demand (BOD), Total Coliform (TC), ammonia (NH3), and pH. Both FCM clustering and the K-means algorithm, based on the concentrations of the dominant parameters, determined 3 cluster groups and produced cluster centers (prototypes). Based on the clustering classification, water quality deteriorated as the cluster number increased from 1 to 3. The cluster grouping can also be used to identify the physical, chemical and biological processes creating the variations in the water quality parameters.
This study revealed that multivariate analysis techniques, through the extracted dominant water quality parameters and the clustered information, can be used to reduce the number of sampling parameters on the Nile River in a cost-effective and efficient way, instead of using a large set of parameters, without losing much information. These techniques can help decision makers obtain a global view of the water quality in any surface water or other water body when analyzing large data sets, especially without a priori knowledge about the relationships between the variables.
Funding: Supported by "San Nong Liu Fang" Science and Technology Cooperation Project of Zhejiang Province (ZNJF[2011] No. 85); Major Project of Science and Technology of Zhejiang Province (2009C2006-1-8)
Abstract: In this study, 32 Luffa germplasm resources were collected from various regions in Zhejiang Province as experimental materials to investigate 22 agronomic traits, including fruit bearing habit, leaf margin, fruit ribbing and the percentage of nodes with female flowers among total nodes. Based on the experimental data, principal component analysis and cluster analysis were carried out using DPS software. The results showed that the 22 agronomic traits could be integrated into 5 principal components, with a cumulative contribution rate of 81.308%. According to the correlations between the first five principal components and the traits, 14 traits with great influence were screened. On the basis of the principal component analysis, cluster analysis of the 32 Luffa germplasm resources was conducted, which divided Luffa cylindrica and Luffa acutangula into two categories and six subcategories by Euclidean genetic distance. This study provides a scientific basis for the collection, preservation, identification, creation and utilization of Luffa germplasm and for parent selection in the cross breeding of Luffa.
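The cumulative contribution rate reported above is the share of total variance captured by the leading principal components. The following is a minimal Python sketch of that calculation, not the DPS procedure used in the study; the function names and the example eigenvalues are hypothetical and chosen only for illustration.

```python
def cumulative_contribution(eigenvalues):
    """Return the cumulative contribution rate (in %) after each component.

    `eigenvalues` are the variances of the principal components, e.g. the
    eigenvalues of the correlation matrix of the traits.
    """
    total = sum(eigenvalues)
    rates, running = [], 0.0
    # Components are ranked from largest to smallest variance.
    for value in sorted(eigenvalues, reverse=True):
        running += value
        rates.append(100.0 * running / total)
    return rates


def components_needed(eigenvalues, threshold_pct):
    """Smallest number of components whose cumulative rate reaches the threshold."""
    for k, rate in enumerate(cumulative_contribution(eigenvalues), start=1):
        if rate >= threshold_pct:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues, not data from the study.
example_eigenvalues = [4.0, 3.0, 2.0, 1.0]
print(cumulative_contribution(example_eigenvalues))
print(components_needed(example_eigenvalues, 80.0))
```

A threshold of about 80% is a common rule of thumb for deciding how many components to retain; the abstract does not state which criterion the authors applied.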
Funding: Funded by 973 Program of Ministry of National Defense of China (Grant No. 613237)
Abstract: This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, for which orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap, must be optimized simultaneously. By introducing an index normalization method to convert the performance indices into dimensionless variables within the range [0, 1], a design optimization method based on principal component analysis and cluster analysis is proposed, which consists of the index normalization method, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained with the proposed method. The principal component analysis reduces the total number of indices that are not independent of one another, which saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
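The index normalization step can be illustrated with a small Python sketch. It assumes plain min-max scaling, which is one common way to map indices with different units into [0, 1]; the abstract does not specify the exact formula. The `larger_is_better` flag is also an assumption, added so that an index such as maximum coverage gap, where smaller values are preferable, can be oriented the same way as the others.

```python
def normalize_index(values, larger_is_better=True):
    """Min-max scale one performance index into the range [0, 1].

    Assumption: 1.0 always denotes the most desirable value, so indices
    where smaller is better are flipped after scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant index carries no ranking information.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    if larger_is_better:
        return scaled
    return [1.0 - s for s in scaled]


# Hypothetical values for three candidate orbits.
total_coverage_time = [10.0, 20.0, 30.0]   # larger is better
max_coverage_gap = [2.0, 4.0, 6.0]         # smaller is better

print(normalize_index(total_coverage_time))
print(normalize_index(max_coverage_gap, larger_is_better=False))
```

After this step every index is dimensionless, so the normalized columns can be passed together to principal component analysis without one unit dominating the others.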
Abstract: Gold mining is now widely acknowledged as one of the significant sources of soil pollution in developed countries. In developing countries, the sources and levels of soil contamination have not been thoroughly addressed. This study was therefore intended to determine the source of soil pollution and the level of contamination in active and closed gold mining areas. The paper presents the pollution load of heavy metals (lead-Pb, chromium-Cr, cadmium-Cd, copper-Cu, arsenic-As, manganese-Mn, and nickel-Ni) in 90 soil samples collected from the studied sites. Multivariate statistical analysis, including Principal Component Analysis (PCA) and Cluster Analysis (CA), coupled with correlation coefficient analysis, was performed to determine the possible sources of pollution in the study areas. The results indicated that Pb, Cr, Cu and Mn come from different sources than Cd, As and Ni. The metal pollution assessment using the Pollution Index (PI) and the Geoaccumulation Index (Igeo) confirmed that soils in the mining areas ranged from moderately through strongly to highly contaminated. This study verified that soil contamination in the gold mining areas results from natural and anthropogenic processes. The findings enhance our knowledge of the soil contamination level in the mining areas and the source of contamination. It is recommended to use PCA, CA, PI and Igeo to assess and monitor heavy-metal-contaminated soil in gold mining areas.
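For readers unfamiliar with the two indices, the sketch below uses their commonly cited definitions: the single-element pollution index PI = C / B, and the geoaccumulation index Igeo = log2(C / (1.5 B)), where C is the measured concentration and B the geochemical background value. The abstract does not give the formulas or background values, so treating these as the definitions used in the study is an assumption, and the numbers in the example are hypothetical.

```python
import math


def pollution_index(concentration, background):
    """Single-element pollution index: PI = C / B."""
    return concentration / background


def geoaccumulation_index(concentration, background):
    """Geoaccumulation index: Igeo = log2(C / (1.5 * B)).

    The factor 1.5 allows for natural fluctuations in the background value.
    """
    return math.log2(concentration / (1.5 * background))


# Hypothetical concentration and background in mg/kg, not data from the study.
measured, background = 60.0, 20.0
print(pollution_index(measured, background))
print(geoaccumulation_index(measured, background))
```

Both indices compare a measured value against a background, so the choice of background value largely determines the resulting contamination class.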
Funding: Shaanxi Natural Science Fundamental Research Foundation (2011JM1019)
Abstract: Based on 10 years of statistics on economic development in Xi'an, the main part of the Guanzhong-Tianshui Economic Zone, this article builds a system of main indicators to reflect economic development. Using two mathematical methods (principal component analysis and cluster analysis), we carry out a comprehensive evaluation of the main economic indicators, point out the differences in the distribution of economic development levels in this region, and classify them, in order to provide a scientific basis for decision-making bodies to formulate economic development strategies in accordance with the level of economic development and geographical location.
Abstract: This study examined public attitudes concerning the value of outdoor spaces that people use daily. Two successive analyses were performed based on data from common residents and college students in the city of Hangzhou, China. First, citizens registered various items constituting desirable values of residential outdoor spaces through a preliminary questionnaire. The result proposed three general attributes (functional, aesthetic and ecological) and ten specific qualities of residential outdoor spaces. An analytic hierarchy process (AHP) was applied to an interview survey in order to clarify the weights among these attributes and qualities. Second, principal factors were extracted from the ten specific qualities with principal component analysis (PCA) for both the common case and the campus case. In addition, the variations among respondents' groups were classified with cluster analysis (CA) using the results of the PCA. The results of the AHP application found that the public prefers the functional attribute to the aesthetic attribute, although the latter is always viewed as the core value of open spaces in the eyes of architects and designers. Furthermore, comparisons of the ten specific qualities showed that the public prefers open spaces that can be used conveniently and easily for group activities, because such spaces sustain an active lifestyle of neighborhood communication, which is also seen to protect human-regarding residential environments. Moreover, different groups of respondents diverge largely in terms of gender, age, behavior and preference.
Funding: Supported by Major Science and Technology Project in Yunnan Province (2018ZG009).
Abstract: With 16 Yunnan tea tree varieties and 5 Kenyan tea tree varieties as test materials, the differences in biochemical components between Yunnan and Kenyan tea tree varieties were compared and analyzed. The results showed that the coefficients of variation of tea polyphenols, amino acids, caffeine, water extract, gallic acid (GA), catechin (C), epicatechin (EC), epicatechin gallate (ECG), epigallocatechin (EGC), epigallocatechin gallate (EGCG) and total catechins in Yunnan tea tree varieties were greater than those in Kenyan tea tree varieties. The contents of tea polyphenols, amino acids, caffeine, water extract, C, EC, EGC, EGCG and total catechins in Yunnan tea tree varieties did not differ significantly from those in Kenyan tea tree varieties (P>0.05), while there were significant differences in the contents of GA and ECG between Yunnan and Kenyan tea tree varieties (P<0.05). Therefore, GA and ECG might be among the main characteristics distinguishing the biochemical components of Yunnan tea tree varieties from those of Kenyan tea tree varieties. The cluster analysis results showed that at a genetic distance of 15, the 21 tested tea varieties could be divided into three groups with obvious biochemical differences.
Abstract: With the development of the power grid, condition assessment methods for the transformer, one of its key pieces of equipment, have always received attention from experts, and scholars are increasingly concerned with the practicality and reliability of these methods. In traditional condition assessment, because of the transformer's complex structure, the assessment system is either not comprehensive enough or too complex, and the indices are not easy to quantify. It is therefore necessary to propose a condition assessment method that is easy to carry out and does not compromise the assessment results. In this paper, we propose a method for assessing the condition of transformers with complex structures. First, we establish a comprehensive assessment system; we then apply principal component analysis to optimize the index system and use cloud matter-element theory. Finally, the reliability and rationality of the method are verified by an example.
Abstract: The accurate extraction and classification of leather defects is an important guarantee for automation and quality evaluation in the leather industry. To address the problem of classifying leather defect data, a hierarchical classification of defects is proposed. First, samples are collected according to the minimum-rectangle method, and defects are extracted by image processing. According to their geometric features, the defects are roughly classified into dot, line and surface types. By analysing the geometric, gray-level and texture data extracted from the defects, the dominant characteristics are obtained. For each type of defect, different representative characteristics are chosen to reduce the dimension of the data; clustering on these characteristics converges effectively, so that the defect characteristics are extracted accurately and digitized, and a database is eventually established. The results show that this method achieves more than 90% accuracy and greatly improves the accuracy of classification.
Funding: Climbing Peak Discipline Project of Shanghai Dianji University, China (No. 15DFXK02); Hi-Tech Research and Development Programs of China (No. 2007AA041600)
Abstract: Dimensionality reduction techniques play an important role in data mining. Kernel entropy component analysis (KECA) is a newly developed method for data transformation and dimensionality reduction. This paper presents a comparative study of KECA with five other dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), locally linear embedding (LLE), Laplacian eigenmaps (LAE) and diffusion maps (DM). Three quality assessment criteria, the local continuity meta-criterion (LCMC), the trustworthiness and continuity measure (T&C), and the mean relative rank error (MRRE), are applied as direct performance indices to assess these dimensionality reduction methods. Moreover, clustering accuracy is used as an indirect performance index to evaluate the quality of the representative data obtained by these methods. The comparisons are performed on six datasets, and the results are analyzed by the Friedman test with the corresponding post-hoc tests. The results indicate that KECA performs excellently in terms of both the quality assessment criteria and clustering accuracy.
Abstract: Glass is precious material evidence of trade along the early Silk Road. Ancient glass was easily affected by the environment and by weathering, and the resulting change in composition ratios affects the correct judgment of its category. In this paper, mathematical models and methods such as the Chi-square test, the weighted average method, principal component analysis, cluster analysis, a binary classification model and grey correlation analysis were used together to analyze the data of sample glass products in combination with their categories. The results showed that the weathered high-potassium glass could be divided into 12, 9, 10 and 27, 7, 22 and so on.