Funding: This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202300502, KJQN201800539).
Abstract: In clustering algorithms, the selection of neighbors significantly affects the quality of the final clustering results. While various neighbor relationships exist, such as K-nearest neighbors, natural neighbors, and shared neighbors, most can handle only a single structural relationship, and their identification accuracy is low for datasets with multiple structures. In everyday life, people's first instinct when facing something complex is to divide it into multiple parts. Partitioning the dataset into more sub-graphs is a good approach to identifying complex structures. Taking inspiration from this, we propose a novel neighbor method: Shared Natural Neighbors (SNaN). To demonstrate the superiority of this neighbor method, we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters (HC-SNaN). Our algorithm excels in identifying both spherical clusters and manifold clusters. Tested on synthetic and real-world datasets, HC-SNaN demonstrates significant advantages over existing clustering algorithms, particularly on datasets containing clusters of arbitrary shape.
Abstract: Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare these methods. We offer the correct syntax to deactivate the similarity algorithm for clustering analysis within the hierarchical clustering module of SPSS. Findings: When one inputs co-occurrence matrices into the data editor of the SPSS hierarchical clustering module without deactivating the embedded similarity algorithm, the program calculates similarity twice, and thus distorts and overestimates the degree of similarity. Practical implications: We offer the correct syntax to block the similarity algorithm for clustering analysis in the SPSS hierarchical clustering module in the case of co-occurrence matrices. This syntax enables researchers to avoid obtaining incorrect results. Originality/value: This paper presents a method of editing syntax to prevent the default use of a similarity algorithm in SPSS's hierarchical clustering module. This will help researchers, especially those from China, to properly implement the co-occurrence matrix when using SPSS for hierarchical cluster analysis, in order to provide more scientific and rational results.
Funding: Supported by the Science and Technology Research and Development Project of Chengde City, Hebei Province (201706A043) and the Young Scholar Program of Hebei Pharmaceutical Association Hospital Pharmaceutical Research Project (2020—Hbsyxhqn0029).
Abstract: [Objectives] To explore the compatibility rules of neonatal parenteral nutrition (PN) prescriptions based on association rules and hierarchical cluster analysis, thereby providing a reference for standardizing neonatal parenteral nutrition supportive therapy. [Methods] Data on neonatal PN formulations prepared by the Pharmacy Intravenous Admixture Services (PIVAS) of the Affiliated Hospital of Chengde Medical University from July 2015 to June 2021 were collected. The general information of the prescriptions and the frequency of drug use were analyzed with Excel 2019; the boxplot of drug dosing was drawn using GraphPad 8.0 software; and SPSS Modeler 18.0 and SPSS Statistics 26.0 were used to perform association rules and hierarchical cluster analysis. [Results] A total of 11,488 PN prescriptions were collected from 1,421 newborns, involving 18 kinds of drugs, which were divided into 11 types of nutrients. Association rules analysis yielded 84 nutrient combinations. The combination of fat emulsion, water-soluble vitamins, fat-soluble vitamins, glucose, and amino acids had the highest confidence (99.95%). The hierarchical cluster analysis divided the nutrients into 5 types. [Conclusions] PN prescriptions for newborns were composed of five types of nutrients: amino acids, fat emulsion, glucose, water-soluble vitamins, and fat-soluble vitamins. Depending on which electrolytes and trace elements are lacking, appropriate drugs can be chosen to meet nutritional demands. This study provides a reference basis for the reasonable selection of drugs for neonatal PN prescriptions and for further standardization of PN supportive therapy in newborns.
Funding: Supported by the National Natural Science Foundation of China (30960179) and the Natural Science Foundation of Yunnan Province (2007A048M).
Abstract: [Objective] This research aimed to study the FTIR spectra of corn germs and endosperms so as to provide a scientific way of identifying corn of different types. [Method] The germs and endosperms of three types of corn were studied using Fourier transform infrared spectroscopy (FTIR) combined with cluster analysis. [Result] The overall characteristics of the original FTIR spectra were basically similar within the range of 700-1800 cm⁻¹. The FTIR spectra were composed mainly of the absorption peaks of polysaccharides, proteins and lipids. Within the wavenumber range of 700-1800 cm⁻¹, there were only tiny differences in the original FTIR spectra among the germs and endosperms of the three corn types. The spectra were then processed by taking first and second derivatives. The second derivative spectra were used for hierarchical cluster analysis (HCA). The results showed that, within the wavenumber range of 700-1800 cm⁻¹, the second derivative spectra of the 52 samples could be clustered well according to the three corn types and according to germ versus endosperm. The clustering accuracy reached 96.1%. [Conclusion] FTIR technology, combined with cluster analysis, can be used to identify different types of corn germs and endosperms, and it is convenient and rapid.
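The derivative preprocessing mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the derivatives were computed, so the plain central-difference formula below, the function name `second_derivative`, and the sample values are all assumptions made for illustration; real spectral pipelines often smooth the data first.

```python
def second_derivative(y, dx=1.0):
    """Central-difference second derivative of evenly spaced samples.

    Hypothetical helper: returns len(y) - 2 values, one per interior point.
    """
    if len(y) < 3:
        raise ValueError("need at least three points")
    return [
        (y[i - 1] - 2.0 * y[i] + y[i + 1]) / (dx * dx)
        for i in range(1, len(y) - 1)
    ]


# Toy "spectrum": y = x^2 sampled at x = 0..4, whose second derivative is 2.
spectrum = [0.0, 1.0, 4.0, 9.0, 16.0]
print(second_derivative(spectrum))  # [2.0, 2.0, 2.0]
```

Differentiating twice removes constant and linear baseline offsets, which is one common reason derivative spectra separate samples better than raw spectra.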
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 82073808, 81872828, and 81573384).
Abstract: The fruits of the leguminous plant Cercis chinensis Bunge are still overlooked, although they have been reported to be antioxidative, because of the limited information on their phytochemicals. A simple, rapid and sensitive HPLC-MS/MS method was developed for the identification and quantitation of the major bioactive components in C. chinensis fruits. Eighteen polyphenols were identified, all reported for the first time in C. chinensis fruits. Moreover, ten components were simultaneously quantified. The validated quantitative method proved to be sensitive, reproducible and accurate. It was then applied to analyze batches of C. chinensis fruits of different phytomorphs and from different areas. Principal components analysis (PCA) provided visualization and dimension reduction of the data set, while hierarchical cluster analysis (HCA) indicated that the content of phenolic acids, or of all ten components, might be used to differentiate C. chinensis fruits of different phytomorphs.
Abstract: The paper deals with cluster analysis and a comparison of clustering methods. Cluster analysis belongs to the multivariate statistical methods. It is defined as a general logical technique, or procedure, that allows objects to be grouped into clusters on the basis of similarity or dissimilarity. Cluster analysis involves computational procedures whose purpose is to reduce a data set to several relatively homogeneous groups (clusters), under the condition of maximal similarity within clusters and, simultaneously, minimal similarity between clusters. Similarity of objects is studied through a degree of similarity (correlation coefficient and association coefficient) or a degree of dissimilarity, that is, a degree of distance (distance coefficient). Methods of cluster analysis are classified, according to how they cluster, as hierarchical or non-hierarchical.
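To make the hierarchical, distance-based grouping described in this abstract concrete, here is a minimal sketch of agglomerative clustering. The choice of single linkage with Euclidean distance, the function name `single_linkage`, and the sample points are assumptions for illustration; the paper itself compares several methods that are not reproduced here.

```python
import math


def single_linkage(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Cluster distance is the smallest Euclidean distance between any two
    members (single linkage). Returns clusters as sorted lists of indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


data = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
print(single_linkage(data, 3))  # [[0, 1], [2, 3], [4]]
```

Recording the merge distance at each step, rather than stopping at a fixed count, would give the full hierarchy that a dendrogram displays.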
Funding: This research was supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: Social networking sites in today's modernized world are flooded with large volumes of data. Extracting the sentiment polarity of important aspects is necessary, as it helps to determine people's opinions from what they write. The coronavirus pandemic has swept the world and has been mentioned on social media on a large scale. Within a very short period of time, tweets indicated an unpredicted rise of the coronavirus. They reflect people's opinions and thoughts regarding the coronavirus and its impact on society. The research community has been interested in discovering hidden relationships in short texts, such as those on Twitter and Weibo, given their shortness and sparsity. In this paper, a hierarchical Twitter sentiment model (HTSM) is proposed to show people's opinions in short texts. The proposed HTSM has two main features: it constructs a hierarchical tree of important aspects from short texts without a predefined hierarchy depth and width, and it analyzes the extracted opinions to discover the sentiment polarity of those aspects by applying Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis. The tweets for each extracted important aspect can be categorized as strongly positive, positive, neutral, negative, or strongly negative. The quality of the proposed model is validated by applying it to a popular product and a widespread topic. The results show that the proposed model outperforms state-of-the-art methods in analyzing people's opinions in short texts.
Abstract: The problem of taking a set of data and separating it into subgroups, where the elements of each subgroup are more similar to each other than they are to elements outside the subgroup, has been extensively studied through the statistical method of cluster analysis. In this paper we discuss the application of this method to the field of education. In particular, we present the use of cluster analysis to separate students into groups that can be recognized and characterized by common traits in their answers to a questionnaire, without any prior knowledge of what form those groups would take (unsupervised classification). We start from a detailed study of the data processing needed for cluster analysis. Then two methods commonly used in cluster analysis are described, first from a purely theoretical point of view and then, in Section 4, through an example of application to data from an open-ended questionnaire administered to a sample of university students. In particular, we describe and critique the variables and parameters used to present the results of the cluster analysis methods.
Abstract: A genetic algorithm-based joint inversion method is presented for evaluating hydrocarbon-bearing geological formations. Conventional inversion procedures routinely used in the oil industry process borehole geophysical data locally. Because there are barely more types of data than unknowns at a given depth, a set of marginally over-determined inverse problems has to be solved along a borehole, which is a rather noise-sensitive procedure. To reduce the effect of noise, the degree of overdetermination must be increased. To fulfill this requirement, we suggest the use of our interval inversion method, which simultaneously inverts all data from a greater depth interval to estimate the petrophysical parameters of reservoirs over the same interval. A series expansion-based discretization scheme ensures many more data than unknowns, which significantly reduces the estimation error of the model parameters. Knowledge of reservoir boundaries is also required for reserve calculation. Well logs contain information about layer thicknesses, but it cannot be extracted by the local inversion approach. We showed earlier that the depth coordinates of layer boundaries can be determined within the interval inversion procedure. The weakness of the method is that the output of the inversion is highly influenced by arbitrary assumptions made about layer thicknesses when creating a starting model (i.e., the number of layers and the search domain of thicknesses). In this study, we apply an automated procedure for the determination of rock interfaces. Before inversion, we perform multidimensional hierarchical cluster analysis on the well-logging data, which separates the measuring points of different layers on a lithological basis. As a result, the vertical distribution of clusters furnishes the coordinates of the layer boundaries, which are then used as initial model parameters for the interval inversion procedure. The improved inversion method gives a fast, automatic and objective estimation of layer boundaries and petrophysical parameters, which is demonstrated by a hydrocarbon field example.
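The step in which the vertical distribution of clusters furnishes layer-boundary coordinates can be sketched as follows. This is only an illustration under stated assumptions: the function name `layer_boundaries`, the sample depths and labels, and the choice to place each boundary midway between two neighbouring measuring points are hypothetical and not taken from the paper.

```python
def layer_boundaries(depths, labels):
    """Return candidate boundary depths from per-sample cluster labels.

    A boundary is placed halfway between two consecutive measuring points
    whose cluster labels differ (an assumed convention for this sketch).
    """
    if len(depths) != len(labels):
        raise ValueError("depths and labels must have the same length")
    boundaries = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            boundaries.append((depths[i - 1] + depths[i]) / 2.0)
    return boundaries


# Hypothetical log: six measuring points, three lithological clusters.
depths = [100.0, 100.5, 101.0, 101.5, 102.0, 102.5]
labels = [0, 0, 1, 1, 1, 2]
print(layer_boundaries(depths, labels))  # [100.75, 102.25]
```

Boundaries obtained this way would serve only as starting values, since the abstract states that they are refined by the interval inversion procedure.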
Abstract: A fuzzy clustering analysis model based on the quotient space is proposed. First, the conversion from coarse to fine granularity and the hierarchical structure are used to reduce the multidimensional samples. Second, the fuzzy compatibility relation matrix of the model is converted into a fuzzy equivalence relation matrix. Finally, the clustering genealogy diagram is generated from the fuzzy equivalence relation matrix, which enables the dynamic selection of different thresholds and thus effectively solves the problem of clustering samples with multi-dimensional attributes.
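The conversion and thresholding steps in this abstract can be sketched in a few lines. The abstract does not state how the compatibility matrix is converted, so this sketch assumes the textbook route: a max-min transitive closure of a reflexive, symmetric matrix, followed by a threshold (lambda) cut. The function names and the sample matrix are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square a reflexive fuzzy relation until it stops changing."""
    current = r
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(equiv, lam):
    """Group samples i and j whenever equiv[i][j] >= lam."""
    clusters, seen = [], set()
    for i in range(len(equiv)):
        if i in seen:
            continue
        group = [j for j in range(len(equiv)) if equiv[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters


# Hypothetical compatibility matrix for three samples.
compat = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
equiv = transitive_closure(compat)
print(equiv)                            # entry [0][2] rises from 0.2 to 0.5
print(lambda_cut_clusters(equiv, 0.8))  # [[0, 1], [2]]
print(lambda_cut_clusters(equiv, 0.5))  # [[0, 1, 2]]
```

Sweeping the threshold from 1 down to 0 merges groups step by step, which is how a single equivalence matrix yields the whole clustering hierarchy.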
Funding: This work was supported by the National Natural Science Foundation of China and China Academy of Engineering Physics (No. 10376021) and the Provincial National Science Foundation of He'nan (No. 2004601107).
Abstract: Density functional theory (DFT) was used to calculate molecular descriptors (properties) for 12 fluoroquinolones with anti-S. pneumoniae activity. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were employed to reduce dimensionality and to investigate which variables are most effective for classifying fluoroquinolones according to their degree of anti-S. pneumoniae activity. The PCA results showed that the variables ELUMO, Q3, Q5, QA, logP, MR, VOL and ΔEHL of these compounds were responsible for the anti-S. pneumoniae activity. The HCA results were similar to those obtained with PCA. The PCA and HCA methodologies provide a reliable rule for classifying new fluoroquinolones with anti-S. pneumoniae activity. Using the chemometric results, 6 synthetic compounds were analyzed by PCA and HCA, and two of them are proposed as active molecules against S. pneumoniae, which is consistent with the results of clinical experiments.
Funding: Supported by the National Natural Science Foundation of China (30960179) and the Program for Innovative Research Team in Science and Technology in University of Yunnan Province.
Abstract: In order to distinguish 8 kinds of rhizome crops, 40 samples were studied by Fourier transform infrared spectroscopy (FTIR) combined with wavelet transform (WT), principal component analysis (PCA) and hierarchical cluster analysis (HCA). The results showed that the infrared spectra were similar on the whole, but there were differences in peak position, peak shape and peak absorption intensity in the range of 1800-700 cm⁻¹. The infrared spectra in the range of 1800-700 cm⁻¹ were selected for continuous wavelet transform (CWT) and discrete wavelet transform (DWT). The 15th-level decomposition coefficients of CWT and the 5th-level detail coefficients of DWT were classified by PCA and HCA. The cumulative contribution rates of the first three principal components for CWT and DWT were 93.12% and 89.78%, respectively. The recognition accuracy of both PCA and HCA was 100%. This shows that FTIR combined with WT can be used to distinguish different kinds of rhizome crops.
Funding: Supported by the National High-tech Research Project ("863" Project) of China under contract Nos. 2003AA635180 and 2006AA09Z167, the Public Welfare Project of Marine Science Research under contract No. 200705011, and the open project of the Key Laboratory of Integrated Marine Monitoring and Applied Technologies for Harmful Algal Blooms, SOA, China under contract No. 200811.
Abstract: Hierarchical clustering analysis and principal component analysis (PCA) were used to assess the similarities and dissimilarities of the entire excitation-emission matrix spectroscopy (EEMs) data sets of samples collected from Jiaozhou Bay, China. The results demonstrate that multivariate analysis facilitates complex data treatment and spectral sorting, and also enhances the probability of revealing otherwise hidden information about the chemical characteristics of the dissolved organic matter (DOM). The distribution of the different water samples revealed by the multivariate results has been used to track the movement of DOM in the study area. The interpretation is supported by the results of a numerical simulation model based on a substance tracing technique, which show that substances discharged by the Haibo River can be distributed throughout Jiaozhou Bay.