Natural soil-forming factors such as landforms, parent materials or biota lead to high variability in soil properties. However, there is not enough research quantifying which environmental factor(s) are the most relevant to predicting soil properties at the catchment scale in semi-arid areas. Thus, this research aims to investigate the ability of multivariate statistical analyses to distinguish which soil properties follow a clear spatial pattern conditioned by specific environmental characteristics in a semi-arid region of Iran. To achieve this goal, we digitized parent materials and landforms from recent orthophotography. We also extracted ten topographical attributes from a digital elevation model (DEM) and five remote sensing variables from the Landsat Enhanced Thematic Mapper (ETM). These factors were contrasted for 334 soil samples (depth of 0–30 cm). Cluster analysis and soil maps reveal that Cluster 1 comprises limestones, massive limestones and mixed deposits of conglomerates with low soil organic carbon (SOC) and clay contents, and Cluster 2 is composed of soils that originated from Quaternary and early Quaternary parent materials such as terraces, alluvial fans, lake deposits, and marls or conglomerates that register the highest SOC content and the lowest sand and silt contents. Further, it is confirmed that soils with the highest SOC and clay contents are located in wetlands, lagoons, alluvial fans and piedmonts, while soils with the lowest SOC and clay contents are located in dissected alluvial fans, eroded hills, rock outcrops and steep hills.
The results of principal component analysis using the remote sensing data and topographical attributes identify five main components, which explain 73.3% of the total variability of soil properties. Environmental factors such as hillslope morphology and all of the remote sensing variables can largely explain SOC variability, but no significant correlation is found for soil texture and calcium carbonate equivalent contents. Therefore, we conclude that SOC can be considered the best-predicted soil property in semi-arid regions.
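As an illustration of the kind of statement made above ("five main components, which explain 73.3% of the total variability"), the following is a minimal sketch of how the share of variance per principal component can be computed. It is not the authors' code: the function names `explained_variance_ratio` and `components_needed` are hypothetical, the demo data are synthetic, and standardizing the variables (i.e. working on the correlation matrix) is an assumption, since the abstract does not say how the variables were scaled.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component.

    X is an (n_samples, n_variables) array; no column may be constant.
    """
    X = np.asarray(X, dtype=float)
    # Standardize columns so variables with large units do not dominate
    # (assumption: PCA on the correlation matrix).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigenvalues of the correlation matrix are the component variances;
    # eigvalsh returns them in ascending order, so reverse.
    eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]
    return eigvals / eigvals.sum()


def components_needed(ratios, target):
    """Smallest number of leading components whose cumulative share reaches target (< 1)."""
    return int(np.searchsorted(np.cumsum(ratios), target) + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a table of environmental covariates.
    demo = rng.normal(size=(334, 15))
    ratios = explained_variance_ratio(demo)
    print(ratios.round(3), components_needed(ratios, 0.70))
```

With real covariates, `components_needed(ratios, 0.733)` would report how many leading components are required to reach the cumulative share quoted in the abstract.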
Dongguan (东莞) City, located in the Pearl River Delta, South China, is known for its rapid industrialization over the past 30 years. A total of 90 topsoil samples were collected from agricultural fields in the city, including vegetable and orchard soils, and eight heavy metals (As, Cu, Cd, Cr, Hg, Ni, Pb, and Zn) and other properties (pH and organic matter) were analyzed to evaluate the influence of anthropic activities on the environmental quality of agricultural soils and to identify the spatial distribution and possible sources of trace elements. The elements Hg, Pb, and Cd have accumulated remarkably here in comparison with the soil background content of these elements in Guangdong (广东) Province. Pollution is more serious in the western plain and the central region, which have a high density of industries and rivers. Multivariate and geostatistical methods were applied to differentiate the influences of natural processes and human activities on heavy metal pollution of topsoils in the study area. The results of cluster analysis (CA) and factor analysis (FA) show that Ni, Cr, Cu, Zn, and As are grouped in factor F1, Pb in F2, and Cd and Hg in F3. The spatial pattern of the three factors is well demonstrated by geostatistical analysis. The first factor can be considered a natural source controlled by parent rocks. The second factor can be referred to as "industrial and traffic pollution sources". The third factor is mainly controlled by long-term anthropic activities, as a consequence of agricultural activities, fossil fuel consumption, and atmospheric deposition.
Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that PCA components are uncorrelated but may still be statistically dependent, whereas ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this assumption is not satisfied by many practical processes. To extend the use of PCA, a nonparametric method can be added to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Although ICA is already based on non-Gaussian distribution information, KDE can still help in the close monitoring of the data. Four methods, PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA), are demonstrated and compared by applying them to an industrial Spheripol-process polypropylene catalyzer reactor instead of a laboratory emulator.
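The idea of replacing a Gaussian-based control limit with a KDE-based one can be sketched as follows. This is an illustrative sketch, not the method of the paper: it assumes a one-dimensional monitoring statistic collected under normal operation, a Gaussian kernel, and a normal-reference rule-of-thumb bandwidth; the function names and the default `alpha` are hypothetical choices.

```python
import math
import statistics


def kde_cdf(x, samples, h):
    """CDF at x of a Gaussian kernel density estimate with bandwidth h."""
    root2 = math.sqrt(2.0)
    total = sum(0.5 * (1.0 + math.erf((x - s) / (h * root2))) for s in samples)
    return total / len(samples)


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth 1.06 * sigma * n^(-1/5) (an assumed default)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def kde_control_limit(samples, alpha=0.01, h=None):
    """Upper control limit: the (1 - alpha) quantile of the KDE, found by bisection."""
    h = rule_of_thumb_bandwidth(samples) if h is None else h
    # The KDE CDF is ~0 ten bandwidths below the minimum and ~1 ten above the maximum.
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A new observation whose monitoring statistic exceeds `kde_control_limit(training_statistics)` would be flagged as a possible fault; no distributional form is assumed for the statistic beyond what the training samples show.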
Meretricis concha is a marine traditional Chinese medicine (TCM) commonly used for the treatment of asthma and scald burns. To investigate the relationship between the inorganic elemental fingerprint and the geographical origin of Meretricis concha, the elemental contents of M. concha from five sampling points in Rushan Bay were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se, and Zn), an inorganic elemental fingerprint that reflects the elemental characteristics well was constructed. All the data from the five sampling points were accurately discriminated through hierarchical cluster analysis (HCA) and principal component analysis (PCA). A four-factor model explaining approximately 80% of the detection data was established, and the elements Al, As, Cd, Cu, Ni and Pb could be viewed as the characteristic elements. This investigation suggests that the inorganic elemental fingerprint combined with multivariate statistical analysis is a promising method for verifying the geographical origin of M. concha, and this strategy should be valuable for the authenticity discrimination of some marine TCM.
Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure for performing MSPM&C on a chemical process is analyzed: modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the source cause of the abnormal behavior. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control for nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are given.
A technique for estimating tropical cyclone (TC) intensity over the Western North Pacific using FY-3 Microwave Imager (MWRI) data is developed. As a first step, we investigated the relationship between the FY-3 MWRI brightness temperature (TB) parameters, which are computed in concentric circles or annuli of different radii at different MWRI frequencies, and the TC maximum wind speed (Vmax) from the TC best track data. We found that, for the lower-frequency channels, the minimum TB, mean TB and ratio of pixels over the threshold TB within a radius of 1.0 or 1.5 degrees of the center give the higher correlations. Then, by applying principal component analysis (PCA) and multiple regression, we established an estimation model and evaluated it using independent verification data, obtaining an RMSE of 13 kt. The estimated Vmax is consistently too strong in the early stages of development but slightly too weak toward the mature stage, with the bias changing sign at around 70 kt. TCs with larger errors often have less organized and asymmetric cloud patterns, so classifying the TC cloud pattern should help improve the accuracy of the estimated TC intensity, as should an increase in the number of statistical samples.
Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA) and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012-2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most of the physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP) and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three and four latent factors explaining 70%, 62% and 71% of the total variance in the water quality data sets of the HP, MP and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA over the whole basin gave the best data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrates the utility of multivariate statistical techniques for the analysis and interpretation of datasets and, in water quality assessment, for the identification of pollution sources/factors and
the understanding of spatial variations in water quality for effective stream water quality management.
Groundwater is considered one of the most important sources of water supply in Iran. The Fasa Plain in Fars Province, Southern Iran, is one of the major areas of wheat production using groundwater for irrigation. A large population also uses local groundwater for drinking purposes. Therefore, this plain was selected in this study to assess the spatial variability of groundwater quality and to identify the main parameters affecting water quality using multivariate statistical techniques such as Cluster Analysis (CA), Discriminant Analysis (DA), and Principal Component Analysis (PCA). Water quality data were monitored at 22 different wells for five years (2009-2014) with 10 water quality parameters. Using cluster analysis, the sampling wells were grouped into two clusters with distinct water qualities at different locations. The Lasso Discriminant Analysis (LDA) technique was used to assess the spatial variability of water quality. Based on the results, all of the variables except the sodium adsorption ratio (SAR) are effective in the LDA model, and using all of the primary 10 variables affords 92.80% correct assignation when discriminating between the clusters. Principal component (PC) analysis and factor analysis reduced the complex data matrix into two main components, accounting for more than 95.93% of the total variance. The first PC contained the parameters TH, Ca2+, and Mg2+; therefore, the first dominant factor was hardness. In the second PC, Cl-, SAR, and Na+ were the dominant parameters, which may indicate salinity. The factors obtained reflect natural (existence of geological formations) and anthropogenic (improper disposal of domestic and agricultural wastes) influences on groundwater quality.
A new method using discriminant analysis and control charts is proposed for monitoring multivariate process operations more reliably. Fisher discriminant analysis (FDA) is used to derive a feature discriminant direction (FDD) between normal operation and each fault operation, and each FDD thus obtained constructs the feature space of the corresponding fault operation. Individuals control charts (XmR charts) are used to monitor multivariate processes using the process data projected onto the feature spaces. The upper control limit (UCL) and lower control limit (LCL) on each feature space are calculated from normal process operation for the XmR charts and are used to distinguish fault from normal operation. A variation trend on an XmR chart reveals the type of the relevant fault operation. Applications to the Tennessee Eastman simulation process show that the proposed method can give better monitoring performance than principal component analysis (PCA)-based methods and can better identify step-type faults on XmR charts.
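The UCL and LCL of an individuals (XmR) chart mentioned above can be sketched with the standard chart constants (2.66 for the individuals chart and 3.267 for the moving-range chart, both for a moving range of span 2). The sketch below assumes the input is a one-dimensional series of scores already projected onto a feature direction; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def xmr_limits(values):
    """Control limits of an XmR chart estimated from normal-operation values."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving ranges between consecutive observations (span 2).
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,   # lower limit, individuals chart
        "ucl": center + 2.66 * mr_bar,   # upper limit, individuals chart
        "mr_ucl": 3.267 * mr_bar,        # upper limit, moving-range chart
    }


def out_of_control(values, limits):
    """Indices of observations falling outside the individuals-chart limits."""
    return [i for i, v in enumerate(values)
            if v < limits["lcl"] or v > limits["ucl"]]


if __name__ == "__main__":
    normal_scores = [10, 12, 11, 13, 12]   # hypothetical projected scores
    limits = xmr_limits(normal_scores)
    print(limits, out_of_control([11, 16, 7, 12], limits))
```

In a scheme like the one described, one such chart would be kept per fault direction, with limits estimated from normal operation only.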
Surface water has become one of the most vulnerable resources on Earth because its quality is deteriorating under diverse sources of pollution. Understanding the spatiotemporal distribution of pollutants and identifying their sources in river systems is a prerequisite for the protection and sustainable utilization of water resources. Multivariate statistical techniques such as Principal Component Analysis (PCA) and Factor Analysis (FA) were applied in this study to investigate the temporal and spatial variations of water quality and to identify the major factors of pollution in the Shailmari River system. Water quality data for 14 physicochemical parameters from 11 monitoring sites in three sampling seasons of 2014 were collected and analyzed. The Kruskal-Wallis test showed significant (p < 0.01) temporal and spatial variations in all of the water quality parameters of the river water. PCA extracted the parameters contributing to the seasonal water quality in the river system. Scatter plots of the PCs showed the tidal and spatial variation within the river system and identified the parameters controlling the behavior in each case. FA further reduced the data and extracted the factors that are significantly responsible for water quality variation in the river. The results indicate that the parameters controlling the water quality in different seasons are related to salinity, anthropogenic pollution (sewage disposal, effluents) and agricultural runoff in pre-monsoon; precipitation-induced surface runoff in monsoon; and erosion, oxidation or organic pollution (point and non-point sources) in post-monsoon.
Therefore, the study demonstrates the applicability and usefulness of multivariate statistical methods in assessing river water quality by identifying the potential environmental factors controlling the water quality in different seasons, which might help to better understand, monitor and manage the quality of water resources.
Considering the problems that currently need to be solved in synthetic earthquake prediction, a new model is proposed in this paper: a joint multivariate statistical model that combines principal component analysis with discriminant analysis. Principal component analysis and discriminant analysis are important theories in multivariate statistical analysis, which has developed quickly over the last thirty years. By means of the information maximization method, we choose from numerous earthquake prediction factors several whose cumulative proportion of total sample variance exceeds 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30°–42°N, 108°–125°E), and better prediction results are obtained.
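Mahalanobis discrimination, as named above, assigns an observation to the group whose mean is closest in Mahalanobis distance. The following is a minimal sketch under stated assumptions: a pooled covariance matrix shared by all groups, at least two variables, and hypothetical group labels and numbers; it does not reproduce the model or data of the paper.

```python
import numpy as np


def fit_groups(groups):
    """Group means and inverse pooled covariance.

    groups maps a label to an (n_i, p) array of training observations, p >= 2.
    """
    arrays = {k: np.asarray(v, dtype=float) for k, v in groups.items()}
    means = {k: a.mean(axis=0) for k, a in arrays.items()}
    dof = sum(len(a) - 1 for a in arrays.values())
    # Pooled covariance: within-group covariances weighted by degrees of freedom.
    pooled = sum((len(a) - 1) * np.cov(a, rowvar=False) for a in arrays.values()) / dof
    return means, np.linalg.inv(pooled)


def classify(x, means, inv_cov):
    """Label with the smallest squared Mahalanobis distance, plus all distances."""
    x = np.asarray(x, dtype=float)
    d2 = {k: float((x - m) @ inv_cov @ (x - m)) for k, m in means.items()}
    return min(d2, key=d2.get), d2


if __name__ == "__main__":
    training = {  # hypothetical two-variable predictor scores
        "quiet": [[0, 0], [1, 0], [0, 1], [1, 1]],
        "active": [[5, 5], [6, 5], [5, 6], [6, 6]],
    }
    means, inv_cov = fit_groups(training)
    print(classify([5.2, 5.9], means, inv_cov)[0])
```

In a combined scheme, the inputs to `classify` would typically be the retained principal component scores rather than the raw predictors.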
Heterogeneity of biological samples is usually considered a major obstacle for three-dimensional (3D) structure determination of macromolecular complexes. Heterogeneity may occur at the level of composition or conformational variability of complexes and affects most 3D structure determination methods that rely on signal averaging. Here, an approach is described that allows sorting structural states based on a 3D statistical approach, the 3D sampling and classification (3D-SC) of 3D structures derived from single particles imaged by cryo electron microscopy (cryo-EM). The method is based on jackknifing and bootstrapping of 3D sub-ensembles and 3D multivariate statistical analysis followed by 3D classification. The robustness of the statistical sorting procedure is corroborated using model data from an RNA polymerase structure and experimental data from a ribosome complex. It allows resolving multiple states within heterogeneous complexes, which thus become amenable to structural analysis despite their highly flexible nature. The method has important implications for high-resolution structural studies and allows describing structure ensembles to provide insights into the dynamics of multi-component macromolecular assemblies.
[Objective] The study aimed to investigate the relationship between soil and environment on the basis of multivariate statistical analysis. [Method] Through field investigation, sampling and laboratory analysis, we examined the relationship between soil properties and environmental factors in Mizhi County, North Shaanxi, using Canoco multivariate statistical analysis. [Result] According to the effects of the various environmental factors on soil properties, the order of influence was land use type > vegetation type > vegetation restoration years > vegetation coverage > slope aspect > gradient > elevation. Overall, soil properties were significantly affected by land use type and vegetation type, which were the most important environmental factors for the spatial variation of soil properties, while vegetation restoration years were closely related to the accumulation of soil nutrients. [Conclusion] The research could provide theoretical references for the construction of the ecological environment in the Loess Plateau of China.
Jinhongtang is a traditional Chinese medicine formula composed of Rheum palmatum L. stem, Sargentodoxa cuneata stem, and Taraxacum mongolicum and is used for the treatment of sepsis. However, no quality assessment method for Jinhongtang is available. In the present study, we developed a UFLC-MS/MS method to determine 16 analytes in 20 batches of home-made and commercial Jinhongtang. Multivariate statistical analysis revealed significant differences in quality between home-made and commercial Jinhongtang, and the differences in quality among home-made samples were more significant. The integrated strategy based on UFLC-MS/MS and multivariate statistical analysis provides a new basis for the overall quality assessment of Jinhongtang.
Biology is a challenging and complicated mess. Understanding this challenging complexity is the realm of the biological sciences: trying to make sense of the massive, messy data by discovering patterns and revealing the underlying general rules. Among the most powerful mathematical tools for organizing and helping to structure complex, heterogeneous and noisy data are the tools provided by multivariate statistical analysis (MSA) approaches. These eigenvector/eigenvalue data-compression approaches were first introduced to electron microscopy (EM) in 1980 to help sort out different views of macromolecules in a micrograph. After 35 years of continuous use and development, new MSA applications are still being proposed regularly. The speed of computing has increased dramatically in the decades since their first use in electron microscopy. However, we have also seen a possibly even more rapid increase in the size and complexity of the EM data sets to be studied. MSA computations had thus become a very serious bottleneck limiting their general use. The parallelization of our programs, which speeds up the process by orders of magnitude, has opened whole new avenues of research. The speed of the automatic classification in the compressed eigenvector space had also become a bottleneck that needed to be removed. In this paper we explain the basic principles of multivariate statistical eigenvector-eigenvalue data compression; we provide practical tips and application examples for those working in structural biology, and we provide the more experienced researcher in this and other fields with the formulas associated with these powerful MSA approaches.
In the present study, an ultra performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-QTOF/MS) based chemical profiling approach was developed to rapidly evaluate chemical diversity after co-decocting of the incompatible pair Aconitum carmichaeli Debx. (wu-tou in Chinese, WT) and Bletilla striata (Thunb.) Reichb.f. (bai-ji in Chinese, BJ). Two different kinds of decoctions were prepared: WT-BJ mixed decoction, the mixed water extracts of the individual herbs, and WT-BJ co-decoction, the water extract of the two constituent herbs mixed together. Batches of these two kinds of decoction samples were subjected to UPLC-QTOF/MS analysis, and the datasets of tR-m/z pairs, ion intensities and sample codes were processed with supervised orthogonal partial least squares discriminant analysis (OPLS-DA) to holistically compare the two kinds of decoction samples. Once a clear classification trend was found in the score plot, extended statistical analysis was performed to generate an S-plot, in which the variables (tR-m/z pairs) contributing most to the difference were clearly depicted as points at the two ends of the "S", and the components that correlate to these ions were regarded as the most changed components during co-decocting of the incompatible pair. The changed components can be identified by comparing their retention times and mass spectra with those of reference compounds and/or tentatively assigned by matching empirical molecular formulae with those of known compounds published in the literature.
Using the proposed approach, a global chemical difference was found between the mixed decoction and the co-decoction, and hypaconitine, mesaconitine, deoxyaconitine, aconitine, 10-OH-mesaconitine, 10-OH-aconitine and deoxyhypaconitine were identified as the most changed toxic components of the WT-BJ incompatible pair during co-decocting. It is suggested that this newly established approach could be used in practice to reveal the toxic components that are changed or increased in herbal combination taboos, e.g. the Eighteen Incompatible Medications (Shi Ba Fan), in traditional Chinese medicine.
The water quality of the Mexican tropical Lake Chapala was assessed through multivariate statistical techniques, cluster analysis (CA) and principal component analysis (PCA), at ten different monitoring sites for ten physicochemical variables and six metals. This study evaluated and interpreted complex water quality data sets and apportioned pollution sources to obtain better information about water quality. From the descriptive statistics, the highest concentrations of metals occurred during the dry season, a trend explained by an unusual rainy event during February 2009 that brought metals into the lake through runoff from nearby mountains. According to international criteria for water consumption by aquatic organisms [USEPA], only Zn concentrations were below the criteria, whereas the values of Ni, Pb, Cd and Fe were above the corresponding criteria (Ni: 52 μg·L⁻¹, Pb: 2.5 μg·L⁻¹, Cd: 0.25 μg·L⁻¹, and Fe: 1000 μg·L⁻¹). Correlations were identified by PCA, and the PCA scores were then used to classify the samples by CA. Seven significant cluster groups of sampling locations, (sites 4 and 5), (sites 3 and 9), (site 7), (site 10), (sites 2 and 6), (site 8) and (site 1), were detected on the basis of similarity of their water quality. The results revealed that the stress exerted on the lake by waste sources follows the order: domestic > agricultural > industrial.
This paper presents the results of a hydrogeochemical study of the thermal waters of the Constantine area, Northeastern Algeria, using geochemical and statistical tools. The samples were collected in December 2016 from twelve hot springs and were analyzed for physicochemical parameters (electric conductivity, pH, total dissolved solids, temperature, Ca, Mg, Na, K, HCO3, Cl, SO4, and SiO2). The waters of the thermal springs have temperatures varying from 28 to 51 °C and electric conductivity values ranging from 853 to 5630 μS/cm. Q-mode cluster analysis resulted in the determination of two major water types: a Ca–HCO3–SO4 type with moderate salinity and a Na–K–Cl type with high salinity. The plot of the major ions versus the saturation indices suggests that the hydrogeochemistry of the thermal groundwater is mainly controlled by dissolution/precipitation of carbonate minerals, dissolution of evaporite minerals (halite and gypsum), and ion exchange of Ca (and/or Mg) by Na. The Gibbs diagram shows that evaporation is another factor, playing a minor role. Principal Component Analysis produced three significant factors that account for 88.2% of the total variance and illustrate the main processes controlling the chemistry of the groundwaters, which are, respectively, the dissolution of evaporite minerals (halite and gypsum), ion exchange, and dissolution/precipitation of carbonate minerals. The subsurface reservoir temperatures were calculated using different cation and silica geothermometers, which gave temperatures ranging between 17 and 279 °C. The Na–K and Na–K–Ca geothermometers gave high temperatures (up to 279 °C), whereas the temperatures estimated from K/Mg geothermometers were the lowest (17–53 °C).
Silica geothermometers gave the most reasonable estimates of the subsurface water temperature, between 20 and 58 °C, which indicate possible mixing with cooler Mg groundwaters, as also indicated by the Na–K–Mg plot (samples in the immature water field) and by the silica and chloride mixing models. The results of stable isotope analyses (δ¹⁸O and δ²H) suggest that the thermal waters originate from precipitation that recharged at a higher altitude (600–1200 m) and infiltrated through deep faults and fractures in carbonate formations. The waters circulate at an estimated depth that does not exceed 2 km and are heated by a high conductive heat flow before rising to the surface through faults that act as hydrothermal conduits. During their ascent to the surface, they undergo various physical and chemical changes, such as cooling by conduction and changes in their chemical constituents due to mixing with cold groundwaters.
Funding (soil-property study, semi-arid Iran): financial support of Isfahan University of Technology (IUT) for this research
Funding: Supported by the Ministry of Land and Resources of China (No. [2005]011-16); the State Environment Protection Administration of China (No. 2001-1-2); the State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences; and the Guangdong Provincial Office of Science and Technology via NSF Team Project and Key Project (Nos. 06202438, 2004A3030800)
Abstract: Dongguan (东莞) City, located in the Pearl River Delta, South China, is famous for its rapid industrialization in the past 30 years. A total of 90 topsoil samples were collected from agricultural fields in the city, including vegetable and orchard soils, and eight heavy metals (As, Cu, Cd, Cr, Hg, Ni, Pb, and Zn) and other items (pH values and organic matter) were analyzed to evaluate the influence of anthropic activities on the environmental quality of agricultural soils and to identify the spatial distribution and possible sources of trace elements. The elements Hg, Pb, and Cd have accumulated remarkably here, in comparison with the soil background content of elements in Guangdong (广东) Province. Pollution is more serious in the western plain and the central region, which have a high density of industries and rivers. Multivariate and geostatistical methods were applied to differentiate the influences of natural processes and human activities on the pollution of heavy metals in topsoils in the study area. The results of cluster analysis (CA) and factor analysis (FA) show that Ni, Cr, Cu, Zn, and As are grouped in factor F1, Pb in F2, and Cd and Hg in F3. The spatial pattern of the three factors is well demonstrated by geostatistical analysis. It is shown that the first factor could be considered as a natural source controlled by parent rocks. The second factor could be referred to as "industrial and traffic pollution sources". The source of the third factor is mainly controlled by long-term anthropic activities, as a consequence of agricultural activities, fossil fuel consumption, and atmospheric deposition.
Funding: Supported by the National Natural Science Foundation of China (No. 60574047) and the Doctorate Foundation of the State Education Ministry of China (No. 20050335018).
Abstract: Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA are still dependent, while ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this assumption cannot be satisfied by several practical processes. To extend the use of PCA, a nonparametric method is added to PCA to overcome the difficulty, and kernel density estimation (KDE) is a good choice. Though ICA is based on non-Gaussian distribution information, KDE can help in the close monitoring of the data. Methods such as PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol craft polypropylene catalyzer reactor instead of a laboratory emulator.
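The idea of replacing a Gaussian control limit with a KDE-based one can be illustrated with a minimal sketch. It assumes a one-dimensional monitoring statistic collected under normal operation, a Gaussian kernel, and a rule-of-thumb bandwidth; none of these choices is stated in the abstract, and the sample data are synthetic.

```python
import math
import random
import statistics


def kde_cdf(x, samples, h):
    """CDF of a Gaussian kernel density estimate, evaluated at x."""
    scale = h * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def kde_control_limit(samples, alpha=0.01):
    """Upper control limit: the (1 - alpha) quantile of the KDE, found by bisection.

    The bandwidth uses a rule of thumb (assumption, not taken from the paper).
    """
    n = len(samples)
    h = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), h


# Synthetic, clearly non-Gaussian monitoring statistic (hypothetical data).
random.seed(0)
stat = [random.expovariate(1.0) for _ in range(500)]
ucl, h = kde_control_limit(stat)
print(f"KDE-based 99% control limit: {ucl:.3f}")
```

A new observation whose statistic exceeds `ucl` would be flagged as a possible fault; the same routine applies whether the statistic comes from PCA or ICA scores.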
Funding: Supported by the Program for Science and Technology of Shandong Province (2011GHY11521); the Department of Education of Shandong Province (No. J11LB07); and the Natural Science Foundation of Qingdao City (Nos. 12-1-3-52-(1)-nsh and 12-1-4-16-(7)-jch)
Abstract: Meretricis concha is a marine traditional Chinese medicine (TCM) that has been commonly used for the treatment of asthma and scald burns. In order to investigate the relationship between the inorganic elemental fingerprint and the geographical origin of Meretricis concha, the elemental contents of M. concha from five sampling points in Rushan Bay were determined by means of inductively coupled plasma optical emission spectrometry (ICP-OES). Based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se, and Zn), an inorganic elemental fingerprint that well reflects the elemental characteristics was constructed. All the data from the five sampling points were discriminated accurately through hierarchical cluster analysis (HCA) and principal component analysis (PCA), which established a four-factor model that could explain approximately 80% of the detection data, and the elements Al, As, Cd, Cu, Ni and Pb could be viewed as the characteristic elements. This investigation suggests that the inorganic elemental fingerprint combined with multivariate statistical analysis is a promising method for verifying the geographical origin of M. concha, and this strategy should be valuable for the authenticity discrimination of some marine TCM.
Funding: Supported by the National High-Tech Development Program of China (No. 863-511-920-011, 2001AA411230).
Abstract: Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure for performing MSPM&C on a chemical process is analyzed: modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the root cause of the abnormal behavior. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control for nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are made.
Funding: National Key Research and Development Program of China (2016YFA0600101); National Basic Research Program of China (973 Program, 2010CB950802); National Natural Science Fund (41605028)
Abstract: A technique for estimating tropical cyclone (TC) intensity over the Western North Pacific utilizing FY-3 Microwave Imager (MWRI) data is developed. As a first step, we investigated the relationship between the FY-3 MWRI brightness temperature (TB) parameters, which are computed in concentric circles or annuli of different radii at different MWRI frequencies, and the TC maximum wind speed (Vmax) from the TC best track data. We found that, for the lower frequency channels, the minimum TB, the mean TB and the ratio of pixels over the threshold TB within a radius of 1.0 or 1.5 degrees from the center give higher correlations. Then, by applying principal component analysis (PCA) and multiple regression, we established an estimation model and evaluated it using independent verification data, with an RMSE of 13 kt. The estimated Vmax is always stronger in the early stages of development but slightly weaker toward the mature stage, and the bias reverses from positive to negative at around 70 kt. TCs with larger errors often have less organized and asymmetric cloud patterns, so classifying TC cloud patterns will help improve the accuracy of the estimated TC intensity, and accuracy will also improve as the number of statistical samples increases.
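The two verification quantities mentioned in the abstract, the RMSE and the sign of the bias on either side of roughly 70 kt, can be computed as in the sketch below. The wind speeds are hypothetical values chosen only to illustrate the arithmetic; they are not data from the study.

```python
import math


def rmse(estimated, observed):
    """Root-mean-square error between estimated and best-track Vmax."""
    return math.sqrt(
        sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed)
    )


def mean_bias(estimated, observed):
    """Mean of (estimated - observed); positive means overestimation."""
    return sum(e - o for e, o in zip(estimated, observed)) / len(observed)


# Hypothetical verification pairs in knots (not from the paper).
best_track = [35, 50, 65, 90, 110]
estimate = [45, 58, 70, 84, 100]

threshold = 70  # approximate boundary reported in the abstract
weak = [(e, o) for e, o in zip(estimate, best_track) if o < threshold]
strong = [(e, o) for e, o in zip(estimate, best_track) if o >= threshold]

total_rmse = rmse(estimate, best_track)
weak_bias = mean_bias(*zip(*weak))
strong_bias = mean_bias(*zip(*strong))
print(total_rmse, weak_bias, strong_bias)
```

With these made-up numbers the bias is positive below the threshold and negative above it, which is the pattern the abstract describes.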
Funding: Project (2012ZX07501002-001) supported by the Ministry of Science and Technology of China
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA) and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012–2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most of the physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP) and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three and four latent factors explaining 70%, 62% and 71% of the total variance in the water quality data sets of the HP, MP and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA for the whole basin showed the best results for data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrated the utility of multivariate statistical techniques for the analysis and interpretation of data sets and, in water quality assessment, for the identification of pollution sources/factors and the understanding of spatial variations in water quality for effective stream water quality management.
Funding: The authors would like to thank the Laboratory of Water Engineering, Fasa University, for providing the facilities to perform this research.
Abstract: Groundwater is considered one of the most important sources of water supply in Iran. The Fasa Plain in Fars Province, Southern Iran, is one of the major areas of wheat production using groundwater for irrigation. A large population also uses local groundwater for drinking purposes. Therefore, in this study, this plain was selected to assess the spatial variability of groundwater quality and to identify the main parameters affecting water quality using multivariate statistical techniques such as Cluster Analysis (CA), Discriminant Analysis (DA), and Principal Component Analysis (PCA). Water quality data were monitored at 22 different wells for five years (2009–2014) with 10 water quality parameters. By using cluster analysis, the sampling wells were grouped into two clusters with distinct water qualities at different locations. The Lasso Discriminant Analysis (LDA) technique was used to assess the spatial variability of water quality. Based on the results, all of the variables except sodium absorption ratio (SAR) are effective in the LDA model, with all variables affording 92.80% correct assignation to discriminate between the clusters from the primary 10 variables. Principal component (PC) analysis and factor analysis reduced the complex data matrix into two main components, accounting for more than 95.93% of the total variance. The first PC contained the parameters TH, Ca²⁺, and Mg²⁺; therefore, the first dominant factor was hardness. In the second PC, Cl⁻, SAR, and Na⁺ were the dominant parameters, which may indicate salinity. The acquired factors illustrate natural (existence of geological formations) and anthropogenic (improper disposal of domestic and agricultural wastes) factors which affect the groundwater quality.
Funding: Sponsored by the Scientific Research Foundation for Returned Overseas Chinese Scholars of the Ministry of Education of China
Abstract: A new method using discriminant analysis and control charts is proposed for monitoring multivariate process operations more reliably. Fisher discriminant analysis (FDA) is used to derive a feature discriminant direction (FDD) between the normal operation and each fault operation, and each FDD thus obtained constructs the feature space of the corresponding fault operation. Individuals control charts (XmR charts) are used to monitor multivariate processes using the process data projected onto the feature spaces. The upper control limit (UCL) and lower control limit (LCL) on each feature space are calculated from normal process operation for the XmR charts and are used to distinguish fault from normal operation. A variation trend on an XmR chart reveals the type of the relevant fault operation. Applications to Tennessee Eastman simulation processes show that the proposed method can result in better monitoring performance than principal component analysis (PCA)-based methods and can better identify step-type faults on XmR charts.
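The control-limit step on a single feature space can be sketched as follows. The example uses the conventional individuals-chart constant 2.66 (3 divided by d2 = 1.128 for moving ranges of size two); the abstract does not say which constant the authors used, and the projected scores below are hypothetical stand-ins for data projected onto one FDD.

```python
def xmr_limits(values):
    """Return (LCL, center line, UCL) for an individuals (X) chart.

    Limits are center +/- 2.66 * mean moving range, the conventional
    XmR formula (assumed here, not confirmed by the abstract).
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width


def out_of_control(values, lcl, ucl):
    """Indices of observations falling outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical scores from normal operation, projected onto one FDD.
normal_scores = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1]
lcl, center, ucl = xmr_limits(normal_scores)

# Hypothetical new scores containing a step change.
new_scores = [0.0, 0.2, 1.9, 2.1]
flags = out_of_control(new_scores, lcl, ucl)
print(flags)
```

In the method described above, one such chart would be maintained per fault-specific feature space, so the chart on which the shift appears hints at the fault type.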
Abstract: Surface water has become one of the most vulnerable resources on the earth due to deterioration of its quality from diverse sources of pollution. Understanding the spatiotemporal distribution of pollutants and identifying their sources in river systems is a prerequisite for the protection and sustainable utilization of water resources. Multivariate statistical techniques such as Principal Component Analysis (PCA) and Factor Analysis (FA) were applied in this study to investigate the temporal and spatial variations of water quality and to identify the major factors of pollution in the Shailmari River system. Water quality data for 14 physicochemical parameters from 11 monitoring sites over three sampling seasons in 2014 were collected and analyzed for this study. The Kruskal-Wallis test showed significant (p < 0.01) temporal and spatial variations in all of the water quality parameters of the river water. PCA allowed the parameters affecting the seasonal water quality in the river system to be extracted. Scatter plots of the PCs showed the tidal and spatial variation within the river system and identified the parameters controlling the behavior in each case. FA further reduced the data and extracted factors that are significantly responsible for water quality variation in the river. The results indicate that the parameters controlling the water quality in different seasons are related to salinity, anthropogenic pollution (sewage disposal, effluents) and agricultural runoff in pre-monsoon; precipitation-induced surface runoff in monsoon; and erosion, oxidation or organic pollution (point and non-point sources) in post-monsoon.
Therefore, the study reveals the applicability and usefulness of the multivariate statistical methods in assessing water quality of river by identifying the potential environmental factors controlling the water quality in different seasons which might help to better understand, monitor and manage the quality of the water resources.
Abstract: Considering the problems that currently need to be solved in synthetic earthquake prediction, a new model is proposed in this paper. It is called the joint multivariate statistical model and combines principal component analysis with discriminant analysis. Principal component analysis and discriminant analysis are very important theories in multivariate statistical analysis, which has developed quickly in the last thirty years. By means of the maximization information method, we choose from numerous earthquake prediction factors several whose cumulative proportion of the total sample variance is beyond 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30°–42°N, 108°–125°E), and better prediction results are obtained.
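The selection rule "cumulative proportion of total sample variance beyond 90%" can be sketched as below. The sketch assumes the criterion is applied to the eigenvalues of the sample covariance (or correlation) matrix, as is usual in principal component analysis; the eigenvalues shown are hypothetical.

```python
def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading components whose cumulative share of
    the total variance strictly exceeds `threshold`, with that share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of a covariance matrix of six prediction factors.
eigenvalues = [4.2, 2.6, 1.5, 0.9, 0.5, 0.3]
k, share = components_for_variance(eigenvalues)
print(k, round(share, 3))
```

Here the first three components account for 83% of the variance and the first four for 92%, so four components are retained.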
Abstract: Heterogeneity of biological samples is usually considered a major obstacle for three-dimensional (3D) structure determination of macromolecular complexes. Heterogeneity may occur at the level of composition or conformational variability of complexes and affects most 3D structure determination methods that rely on signal averaging. Here, an approach is described that allows sorting structural states based on a 3D statistical approach, the 3D sampling and classification (3D-SC) of 3D structures derived from single particles imaged by cryo electron microscopy (cryo-EM). The method is based on jackknifing and bootstrapping of 3D sub-ensembles and 3D multivariate statistical analysis followed by 3D classification. The robustness of the statistical sorting procedure is corroborated using model data from an RNA polymerase structure and experimental data from a ribosome complex. It allows resolving multiple states within heterogeneous complexes, which thus become amenable to structural analysis despite their highly flexible nature. The method has important implications for high-resolution structural studies and allows describing structure ensembles to provide insights into the dynamics of multi-component macromolecular assemblies.
Funding: Supported by the Scientific Research Foundation of Xianyang Normal University for Bringing in Talents (10XSYK104)
Abstract: [Objective] The study aimed to examine the relationship between soil and environment on the basis of multivariate statistical analysis. [Method] Through field investigation, sampling and laboratory analysis, we discussed the relationship between soil properties and environmental factors in Mizhi County, North Shaanxi, by using Canoco multivariate statistical analysis. [Result] According to the effects of various environmental factors on soil properties, the order of influence of the environmental factors was land use type > vegetation type > vegetation restoration years > vegetation coverage > slope aspect > gradient > elevation. In short, soil properties were significantly affected by land use type and vegetation type, which were the most important environmental factors for the spatial variation of soil properties, while vegetation restoration years were closely related to the accumulation of soil nutrients. [Conclusion] The research could provide theoretical references for the construction of the ecological environment in the Loess Plateau of China.
Funding: The authors thank the National Key Research and Development Program of China (2018YFC1705900); the National Natural Science Foundation of China (No. 81903706); the Distinguished Professor program of Liaoning Province (XLYC2002008); and the Science Foundation of the Department of Education of Liaoning Province (LZ2020054) for financial support.
Abstract: Jinhongtang is a traditional Chinese medicine formula composed of Rheum palmatum L. stem, Sargentodoxa cuneata stem, and Taraxacum mongolicum, and is used for the treatment of sepsis. However, a quality assessment method for Jinhongtang is not available. In the present study, we developed a UFLC-MS/MS method to determine 16 analytes in 20 batches of home-made and commercial Jinhongtang. Multivariate statistical analysis revealed significant differences in quality between home-made and commercial Jinhongtang, and the differences in quality among the home-made samples were more significant. The integrated strategy based on UFLC-MS/MS and multivariate statistical analysis provides a new basis for the overall quality assessment of Jinhongtang.
Abstract: Biology is a challenging and complicated mess. Understanding this challenging complexity is the realm of the biological sciences: trying to make sense of the massive, messy data in terms of discovering patterns and revealing its underlying general rules. Among the most powerful mathematical tools for organizing and helping to structure complex, heterogeneous and noisy data are the tools provided by multivariate statistical analysis (MSA) approaches. These eigenvector/eigenvalue data-compression approaches were first introduced to electron microscopy (EM) in 1980 to help sort out different views of macromolecules in a micrograph. After 35 years of continuous use and development, new MSA applications are still being proposed regularly. The speed of computing has increased dramatically in the decades since their first use in electron microscopy. However, we have also seen a possibly even more rapid increase in the size and complexity of the EM data sets to be studied. MSA computations had thus become a very serious bottleneck limiting their general use. The parallelization of our programs, which speeds up the process by orders of magnitude, has opened whole new avenues of research. The speed of the automatic classification in the compressed eigenvector space had also become a bottleneck which needed to be removed. In this paper we explain the basic principles of multivariate statistical eigenvector-eigenvalue data compression; we provide practical tips and application examples for those working in structural biology, and we provide the more experienced researcher in this and other fields with the formulas associated with these powerful MSA approaches.
Funding: Supported by the National Basic Research Program of China ("973" Program) (No. 2011CB505304) and the Youth Scientific Research Project of Anhui Academy of Medical Science (YKY2018003)
Abstract: In the present study, a chemical profiling approach based on ultra performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-QTOF/MS) was developed to rapidly evaluate chemical diversity after co-decocting the incompatible pair Aconitum carmichaeli Debx. (wu-tou in Chinese, WT) and Bletilla striata (Thunb.) Reichb.f. (bai-ji in Chinese, BJ). Two different kinds of decoctions were prepared: the WT-BJ mixed decoction, a mixture of the water extracts of each individual herb, and the WT-BJ co-decoction, a water extract of the two constituent herbs mixed together. Batches of these two kinds of decoction samples were subjected to UPLC-QTOF/MS analysis, and the data sets of tR-m/z pairs, ion intensities and sample codes were processed with supervised orthogonal partial least squares discriminant analysis (OPLS-DA) to holistically compare the two kinds of decoction samples. Once a clear classification trend was found in the score plot, extended statistical analysis was performed to generate an S-plot, in which the variables (tR-m/z pairs) contributing most to the difference were clearly depicted as points at the two ends of the "S", and the components that correlate to these ions were regarded as the most changed components during co-decocting of the incompatible pair. The changed components can be identified by comparing their retention times and mass spectra with those of reference compounds and/or tentatively assigned by matching empirical molecular formulae with those of known compounds published in the literature. Using the proposed approach, a global chemical difference was found between the mixed decoction and the co-decoction, and hypaconitine, mesaconitine, deoxyaconitine, aconitine, 10-OH-mesaconitine, 10-OH-aconitine and deoxyhypaconitine were identified as the most changed toxic components of the WT-BJ incompatible pair during co-decocting.
It is suggested that this newly established approach could be used to practically reveal the possible toxic components changed/increased of the herbal combination taboos, e.g. the Eighteen Incompatible Medications(Shi Ba Fan), in traditional Chinese medicines.
Funding: The authors thank the National Council of Science and Technology (CONACyT) and the Ministry of Public Education-PROMEP for their support through grants No. 84252 and 103.5/13/9346, respectively, and for the CONACyT scholarship of Jessica Badillo-Camacho.
Abstract: The water quality of the Mexican tropical lake Chapala was assessed through the multivariate statistical techniques cluster analysis (CA) and principal component analysis (PCA) at ten different monitoring sites for ten physicochemical variables and six metals. This study evaluated and interpreted complex water quality data sets and apportioned pollution sources to obtain better information about water quality. From the descriptive statistics, the highest concentrations of metals occurred during the dry season, and this trend was explained by the fact that an unusual rainy event occurred during February 2009 and brought metals into the lake in runoff from nearby mountains. According to international criteria for water consumption by aquatic organisms [USEPA], only Zn concentration values were below these criteria, whereas the values of Ni, Pb, Cd and Fe were above the corresponding values set in these criteria (Ni: 52 μg·L⁻¹, Pb: 2.5 μg·L⁻¹, Cd: 0.25 μg·L⁻¹, and Fe: 1000 μg·L⁻¹). The correlations were observed by PCA, and the PCA scores were used to classify the samples by CA. Seven significant cluster groups of sampling locations were detected on the basis of similarity of their water quality: (sites 4 and 5), (sites 3 and 9), (site 7), (site 10), (sites 2 and 6), (site 8) and (site 1). The results revealed that the stress exerted on the lake by waste sources follows the order: domestic > agricultural > industrial.
Funding: Supported by the Faculty of Earth Science, University of Constantine 1
Abstract: This paper presents the results of a hydrogeochemical study of the thermal waters of the Constantine area, Northeastern Algeria, using geochemical and statistical tools. The samples were collected in December 2016 from twelve hot springs and were analyzed for physicochemical parameters (electric conductivity, pH, total dissolved solids, temperature, Ca, Mg, Na, K, HCO₃, Cl, SO₄, and SiO₂). The waters of the thermal springs have temperatures varying from 28 to 51 °C and electric conductivity values ranging from 853 to 5630 μS/cm. Q-mode cluster analysis resulted in the determination of two major water types: a Ca–HCO₃–SO₄ type with moderate salinity and a Na–K–Cl type with high salinity. The plot of the major ions versus the saturation indices suggested that the hydrogeochemistry of the thermal groundwater is mainly controlled by dissolution/precipitation of carbonate minerals, dissolution of evaporite minerals (halite and gypsum), and ion exchange of Ca (and/or Mg) by Na. The Gibbs diagram shows that evaporation is another factor playing a minor role. Principal component analysis produced three significant factors, accounting for 88.2% of the total variance, that illustrate the main processes controlling the chemistry of the groundwaters, which are, respectively, the dissolution of evaporite minerals (halite and gypsum), ion exchange, and dissolution/precipitation of carbonate minerals. The subsurface reservoir temperatures were calculated using different cation and silica geothermometers, which gave temperatures ranging between 17 and 279 °C. The Na–K and Na–K–Ca geothermometers provided high temperatures (up to 279 °C), whereas the temperatures estimated from K/Mg geothermometers were the lowest (17–53 °C).
Silica geothermometers gave the most reasonable estimates of subsurface water temperature, between 20 and 58 °C, which indicates possible mixing with cooler Mg groundwaters, as also suggested by the position of the samples in the immature water field of the Na–K–Mg plot and by the silica and chloride mixing models. The results of stable isotope analyses (δ¹⁸O and δ²H) suggest that the thermal waters are recharged by precipitation that falls at a higher altitude (600–1200 m) and infiltrates through deep faults and fractures in carbonate formations. The waters circulate at an estimated depth that does not exceed 2 km and are heated by a high conductive heat flow before rising to the surface through faults that act as hydrothermal conduits. During their ascent to the surface, they are subjected to various physical and chemical changes, such as cooling by conduction and changes in their chemical constituents due to mixing with cold groundwaters.
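A silica geothermometer of the kind referred to above can be sketched as follows. The abstract does not state which calibrations were used; the sketch assumes two widely cited ones, the quartz (no steam loss) and chalcedony equations usually attributed to Fournier (1977), with dissolved SiO₂ in mg/kg. The input concentration is hypothetical.

```python
import math


def quartz_temperature_c(sio2_mg_per_kg):
    """Quartz geothermometer, no steam loss (assumed calibration):
    T(°C) = 1309 / (5.19 - log10(SiO2)) - 273.15
    """
    return 1309.0 / (5.19 - math.log10(sio2_mg_per_kg)) - 273.15


def chalcedony_temperature_c(sio2_mg_per_kg):
    """Chalcedony geothermometer (assumed calibration):
    T(°C) = 1032 / (4.69 - log10(SiO2)) - 273.15
    """
    return 1032.0 / (4.69 - math.log10(sio2_mg_per_kg)) - 273.15


# Hypothetical silica concentration for one spring (not from the paper).
sio2 = 30.0
t_quartz = quartz_temperature_c(sio2)
t_chalcedony = chalcedony_temperature_c(sio2)
print(f"quartz: {t_quartz:.1f} °C, chalcedony: {t_chalcedony:.1f} °C")
```

For the same silica content the chalcedony equation returns a lower temperature than the quartz equation, which is one reason the choice of calibration matters for low-enthalpy systems such as the one described here.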