The composition control of molten steel is one of the main functions in the ladle furnace (LF) refining process. In this study, a feasible model was established to predict the alloying element yield using principal component analysis (PCA) and a deep neural network (DNN). PCA was used to eliminate collinearity and reduce the dimension of the input variables, and the data processed by PCA were then used to establish the DNN model. In the PCA-DNN model, the prediction hit ratios for the Si element yield in the error ranges of ±1%, ±3%, and ±5% are 54.0%, 93.8%, and 98.8%, respectively, whereas those for the Mn element yield in the error ranges of ±1%, ±2%, and ±3% are 77.0%, 96.3%, and 99.5%, respectively. The results demonstrate that the PCA-DNN model performs better than known models such as the reference heat method, multiple linear regression, modified backpropagation, and the DNN model. Accurate prediction of the alloying element yield can greatly contribute to realizing "narrow window" control of the composition of molten steel. The construction of the prediction model for the element yield can also provide a reference for the development of an alloying control model for intelligent LF refining in the modern iron and steel industry.
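The two quantitative ideas in this abstract, using PCA to remove collinearity from the inputs and scoring a model by its hit ratio within an error band, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names `pca_scores` and `hit_ratio`, the 95% variance threshold, and the data are all assumptions, and the DNN stage is omitted.

```python
import numpy as np

def pca_scores(X, var_kept=0.95):
    """Project standardized X onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the loading vectors.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest k whose cumulative explained variance reaches var_kept (assumed threshold).
    k = int(np.searchsorted(np.cumsum(explained), var_kept) + 1)
    k = min(k, len(s))
    return Z @ Vt[:k].T, k

def hit_ratio(y_true, y_pred, tol):
    """Fraction of predictions whose absolute error is within +/- tol."""
    err = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    return float(np.mean(err <= tol))

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
# Six synthetic inputs; the last three are near-copies of the first three (collinear).
X = np.hstack([base, base + 0.01 * rng.normal(size=(200, 3))])
scores, k = pca_scores(X)
print(k, scores.shape)
```

The resulting score columns are mutually uncorrelated, which is what makes them a convenient input for a downstream regression model such as a neural network.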
Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA are still dependent, while ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this assumption cannot be satisfied by several practical processes. To extend the use of PCA, a nonparametric method is added to PCA to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Though ICA is based on non-Gaussian distribution information, KDE can help in the close monitoring of the data. Methods such as PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol process polypropylene catalyzer reactor instead of a laboratory emulator.
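The role KDE plays here, replacing a Gaussian assumption when setting a control limit for a monitoring statistic, can be illustrated with a short sketch. Everything below is an assumption made for illustration: the function name `kde_control_limit`, the normal reference bandwidth rule, the 1% significance level, and the synthetic chi-square data stand in for details the abstract does not give.

```python
import math
import numpy as np

def kde_control_limit(stat, alpha=0.01, bandwidth=None):
    """Upper control limit: the (1 - alpha) quantile of a Gaussian KDE of `stat`."""
    x = np.asarray(stat, dtype=float)
    n = x.size
    if bandwidth is None:
        # Normal reference rule of thumb (an assumption; no rule is named in the abstract).
        bandwidth = 1.06 * x.std(ddof=1) * n ** (-0.2)

    def cdf(t):
        # CDF of the KDE is the average of Gaussian CDFs centred on the samples.
        z = (t - x) / (bandwidth * math.sqrt(2.0))
        return float(np.mean([0.5 * (1.0 + math.erf(v)) for v in z]))

    lo, hi = x.min() - 10 * bandwidth, x.max() + 10 * bandwidth
    for _ in range(60):  # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(1)
# A skewed, clearly non-Gaussian monitoring statistic from normal operation (synthetic).
train_stat = rng.chisquare(df=3, size=500)
limit = kde_control_limit(train_stat, alpha=0.01)
print(round(limit, 2))
```

A new sample would then be flagged when its statistic exceeds `limit`; the same routine applies whether the statistic comes from PCA or ICA.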
To overcome the overly fine granularity of multivariate grey incidence analysis and to explore a comprehensive incidence analysis model, three multivariate grey incidence degree models based on principal component analysis (PCA) are proposed. First, PCA is introduced to extract the feature sequences of a behavioral matrix. Then, the grey incidence analysis between two behavioral matrices is transformed into a similarity and nearness measure between their feature sequences. Based on classic grey incidence analysis theory, absolute and relative incidence degree models for feature sequences are constructed, and a comprehensive grey incidence model is proposed. Furthermore, the properties of the models are studied. It is proved that the proposed models satisfy the properties of translation invariance, multiple transformation invariance, and the axioms of grey incidence analysis, respectively. Finally, a case is studied. The results illustrate that the model is more effective than other multivariate grey incidence analysis models.
Existing research on process capability indices for multiple quality characteristics mainly focuses on nonconformance of the process output, the conceptual development of univariate process capability indices, quality loss functions, and various comprehensive evaluation methods. Multivariate complexity increases the computational difficulty of multivariate process capability indices (MPCIs), which makes them hard to use in practice. In this paper, a new PCA-based MPCI approach is proposed to assess the production capability of processes that involve multiple product quality characteristics. The approach first transforms the original quality variables into standardized normal variables. MPCI measures are then provided based on the Taam index. Moreover, the statistical properties of these MPCIs, such as confidence intervals and lower confidence bounds, are given so that practitioners understand the capability indices as random variables rather than deterministic quantities. A real manufacturing data set and a synthetic data set are used to demonstrate the effectiveness of the proposed method. An implementation procedure is also provided for quality engineers to apply the MPCI approach in their manufacturing processes. The case studies demonstrate the effectiveness and feasibility of this new kind of MPCI, which is easier to use in production practice. The proposed research provides a novel approach to MPCI calculation.
The kernel principal component analysis (KPCA) method employs the first several kernel principal components (KPCs), which capture most of the variance information of normal observations, for process monitoring, but these may not reflect the fault information. In this study, sensitive kernel principal component analysis (SKPCA) is proposed to improve process monitoring performance, i.e., to deal with the discordance between the T² statistic and the squared prediction error (SPE) statistic and to reduce missed detection rates. The T² statistic can be used to measure the variation directly along each KPC and to analyze the detection performance, as well as to capture the most useful information in a process. By calculating the change rate of the T² statistic along each KPC, SKPCA selects the sensitive kernel principal components for process monitoring. A simulated simple system and the Tennessee Eastman process are employed to demonstrate the efficiency of SKPCA for online monitoring. The results indicate that the monitoring performance is improved significantly.
Using principal component analysis (PCA) and cluster analysis, a multi-index evaluation system for fruit and vegetable nutrition is standardized, reduced in dimension, and de-correlated; principal component factors are then assigned based on cluster analysis of the loading matrix, combined with the actual meaning and evaluation direction of the index categories. Nutritional richness is evaluated according to the nutrition scores of the fruits and vegetables, and equivalent replacement suggestions for vegetables and fruits in different seasons are given according to the clustering results. The study shows that the principal component cluster method can not only classify multivariate data reasonably and effectively but also evaluate the sample objects reasonably, providing a strong basis for evaluating the nutrition of fruits and vegetables.
To improve the reliability of the excavator's hydraulic system, a fault detection approach based on dynamic principal component analysis (PCA) was proposed. Dynamic PCA is an extension of PCA that can effectively extract the dynamic relations among process variables. With this approach, normal samples were first used as training data to develop a dynamic PCA model. Second, the dynamic PCA model decomposed the testing data into projections onto the principal component subspace (PCS) and the residual subspace (RS). Third, the T² statistic and the Q statistic served as indexes of fault detection in the PCS and RS, respectively. Several simulated faults were introduced to validate the approach. The results show that the dynamic PCA model is able to detect overall faults by using the T² statistic and the Q statistic. In simulation analysis, the proposed approach achieves an accuracy of 95% for 20 test sample sets, which shows that the fault detection approach can be effectively applied to the excavator's hydraulic system.
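The three steps described above (train on normal data, split test data into PCS and RS projections, monitor T² and Q) can be sketched in a few lines. This is a hedged illustration, not the paper's model: the helper names, the number of lags and components, the simulated two-sensor signal, the bias fault, and the use of an empirical 99th percentile as the Q limit are all assumptions.

```python
import numpy as np

def lagged(X, lags):
    """Stack each sample with its `lags` previous samples (dynamic PCA input)."""
    n = X.shape[0]
    return np.hstack([X[lags - j : n - j] for j in range(lags + 1)])

def fit_dpca(X_train, lags=2, n_pc=3):
    Xa = lagged(X_train, lags)
    mu, sd = Xa.mean(axis=0), Xa.std(axis=0, ddof=1)
    Z = (Xa - mu) / sd
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_pc].T                          # loadings spanning the PCS
    lam = s[:n_pc] ** 2 / (Z.shape[0] - 1)   # variances of the retained scores
    return {"mu": mu, "sd": sd, "P": P, "lam": lam, "lags": lags}

def t2_q(model, X):
    Z = (lagged(X, model["lags"]) - model["mu"]) / model["sd"]
    T = Z @ model["P"]                        # projection onto the PCS
    t2 = np.sum(T**2 / model["lam"], axis=1)  # T2: scaled distance inside the PCS
    resid = Z - T @ model["P"].T              # part left in the RS
    q = np.sum(resid**2, axis=1)              # Q: squared residual norm
    return t2, q

rng = np.random.default_rng(2)

def simulate(n):
    # Two correlated sensors driven by one autocorrelated latent signal (synthetic).
    t = np.zeros(n)
    for i in range(1, n):
        t[i] = 0.8 * t[i - 1] + rng.normal()
    return np.column_stack([t, 0.5 * t]) + 0.1 * rng.normal(size=(n, 2))

model = fit_dpca(simulate(1000))
_, q_normal = t2_q(model, simulate(1000))
q_limit = np.percentile(q_normal, 99)    # empirical limit (assumed choice)

faulty = simulate(300)
faulty[150:, 1] += 3.0                   # hypothetical sensor bias fault
_, q_test = t2_q(model, faulty)
alarm_rate = float(np.mean(q_test[150:] > q_limit))
print(alarm_rate)
```

The bias breaks the learned relation between the two sensors, so it shows up mainly in the residual subspace and therefore in Q; a fault that moves both sensors consistently would instead appear in T².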
To assess the quality of groundwater resources, samples were collected from 22 points for the mean annual water years of 2003 and 2015 (mean minimum and maximum water table), and 19 parameters were examined and calculated. One objective of this study was to evaluate the groundwater quality of the Ghaemshahr plain, including the spatial and temporal changes in groundwater quality in different sectors and the factors affecting it. Statistical methods such as the Pearson correlation coefficient, factor analysis, principal component analysis, and combined diagrams were used together with hydrochemical methods to assess the chemical quality of the groundwater. Samples were grouped using cluster analysis, and similar samples were identified. Accordingly, the samples were classified into four categories that represent the quality of groundwater in different districts. Factor analysis was used to identify the factors affecting the geochemical processes of the aquifer. The statistical methods were shown to complement the conventional methods of hydrogeochemistry and to yield very precise results. Based on the results, the saturation index showed that Ghaemshahr groundwater was supersaturated, and the groundwater quality of the Ghaemshahr plain is controlled by processes such as dissolution of halide (salt water intrusion from the Caspian Sea and brackish fossil aquifers), dissolution of calcite and dolomite (from limestone, dolomite, and marl at higher elevations), weathering of sodium-rich plagioclases (clay minerals), and ion exchange.
Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure for performing MSPM&C on a chemical process is analyzed: modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the source cause of the abnormal behavior. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control for nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are made.
The principal component analysis (PCA) algorithm is widely applied in a diverse range of fields for performance assessment, fault detection, and diagnosis. However, in the presence of noise and gross errors, nonlinear PCA (NLPCA) using autoassociative bottleneck neural networks is so sensitive that the obtained model differs significantly from the underlying system. In this paper, a robust version of NLPCA is introduced by replacing the generally used error criterion, mean squared error, with a mean log squared error. This is followed by a concise analysis of the corresponding training method. A novel multivariate statistical process monitoring (MSPM) scheme incorporating the proposed robust NLPCA technique is then investigated, and its efficiency is assessed through application to an industrial fluidized catalytic cracking plant. The results demonstrate that, compared with NLPCA, the proposed approach can effectively reduce the number of false alarms and is, hence, expected to better monitor real-world processes.
Pancreatic cancer (PC) is one of the most aggressive and lethal neoplastic diseases. A valid alternative to the usual invasive diagnostic tools would certainly be the determination of biomarkers in peripheral fluids to provide less invasive tools for early diagnosis. Nowadays, biomarkers are generally investigated mainly in peripheral blood and tissues through high-throughput omics techniques comparing control vs pathological samples. The results can be evaluated by two main strategies: (1) classical methods, in which the identification of significant biomarkers is accomplished by monovariate statistical tests where each biomarker is considered independent of the others; and (2) multivariate methods, which take into consideration the correlations existing among the biomarkers themselves. This last approach is very powerful since it allows the identification of pools of biomarkers with diagnostic and prognostic performances that are superior to single markers in terms of sensitivity, specificity, and robustness. Multivariate techniques are usually applied with variable selection procedures to provide a restricted set of biomarkers with the best predictive ability; however, standard selection methods are usually aimed at identifying the smallest set of variables with the best predictive ability, and exhaustivity is usually neglected. The exhaustive search for biomarkers is instead an important alternative to standard variable selection since it can provide information about the etiology of the pathology by producing a comprehensive set of markers.
In this review, the most recent applications of omics techniques (proteomics, genomics, and metabolomics) to the identification of exploratory biomarkers for PC are presented, with particular regard to the statistical methods adopted for their identification. The basic theory related to classical and multivariate methods for the identification of biomarkers is presented, and then the most recent applications in this field are discussed.
Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA), and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012-2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most of the physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP), and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD, and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three, and four latent factors explaining 70%, 62%, and 71% of the total variance in the water quality data sets of the HP, MP, and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations, and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA for the whole basin showed the best results for data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrated the utility of multivariate statistical techniques for the analysis and interpretation of data sets and, in water quality assessment, for identifying pollution sources/factors and understanding spatial variations in water quality for effective stream water quality management.
Considering the problems that currently need to be solved in synthetic earthquake prediction, a new model is proposed in this paper. It is called the joint multivariate statistical model and combines principal component analysis with discriminatory analysis. Principal component analysis and discriminatory analysis are very important theories in multivariate statistical analysis, which has developed quickly over the past thirty years. By means of the maximization information method, we choose from numerous earthquake prediction factors several whose cumulative proportions of total sample variance exceed 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30° to 42°N, 108° to 125°E), and better prediction results are obtained.
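Mahalanobis discrimination, the classification step named in this abstract, assigns a sample to the class whose mean is nearest in covariance-scaled distance. A minimal sketch follows, assuming two hypothetical classes labeled "quiet" and "active" and synthetic two-dimensional scores; none of these names or numbers come from the paper.

```python
import numpy as np

def fit_class(X):
    """Return the class mean and inverse covariance estimated from samples X."""
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    return mu, cov_inv

def mahalanobis_sq(x, mu, cov_inv):
    """Squared Mahalanobis distance from x to a class centre."""
    d = x - mu
    return float(d @ cov_inv @ d)

def classify(x, classes):
    """Pick the class with the smallest squared Mahalanobis distance."""
    return min(classes, key=lambda name: mahalanobis_sq(x, *classes[name]))

rng = np.random.default_rng(4)
# Synthetic principal component scores for two hypothetical seismicity classes.
quiet = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
active = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(100, 2))
classes = {"quiet": fit_class(quiet), "active": fit_class(active)}

label = classify(np.array([3.8, 4.1]), classes)
print(label)
```

Applying the rule to principal component scores rather than raw factors keeps the covariance matrices well conditioned, since the retained components are uncorrelated by construction.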
Inspired by the coarse-to-fine visual perception process of the human vision system, a new approach based on Gaussian multi-scale space was proposed for defect detection of industrial products. By selecting different scale parameters of the Gaussian kernel, a multi-scale representation of the original image data could be obtained and used to constitute a multivariate image, in which each channel represents a perceptual observation of the original image at a different scale. Multivariate Image Analysis (MIA) techniques were used to extract defect feature information. MIA was combined with Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image. The Q-statistic image, derived from the residuals after the extraction of the first principal component score and noise, could be used to efficiently reveal surface defects with an appropriate threshold value determined from training images. Experimental results show that the proposed method performs better than the gray histogram-based method. It is less sensitive to inhomogeneous illumination and provides more robust and reliable defect detection with a lower pseudo reject rate.
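The pipeline in this abstract (Gaussian multi-scale channels, PCA on the pixel-by-channel matrix, a Q-statistic image from the residual after the first component, and a threshold from a training image) can be sketched as below. This is a simplified reading, not the authors' method: the scale values, the 99.9th percentile threshold, the synthetic texture, and the bright square defect are assumptions, and the noise-handling step mentioned in the abstract is not modeled.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing with reflected borders."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x**2) / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def multiscale_pixels(img, sigmas=(1.0, 2.0, 4.0)):
    """One row per pixel, one column per scale (original image first)."""
    channels = [img] + [gaussian_blur(img, s) for s in sigmas]
    return np.stack([c.ravel() for c in channels], axis=1)

def fit_first_pc(pixels):
    """Channel means and first principal direction of the multivariate image."""
    mu = pixels.mean(axis=0)
    _, _, Vt = np.linalg.svd(pixels - mu, full_matrices=False)
    return mu, Vt[0]

def q_image(img, mu, p1):
    """Per-pixel squared residual after removing the first principal component."""
    Z = multiscale_pixels(img) - mu
    resid = Z - np.outer(Z @ p1, p1)
    return np.sum(resid**2, axis=1).reshape(img.shape)

rng = np.random.default_rng(3)
train = 0.5 + 0.1 * rng.normal(size=(64, 64))   # defect-free texture (synthetic)
test = 0.5 + 0.1 * rng.normal(size=(64, 64))
test[30:36, 30:36] += 0.5                        # synthetic bright defect

mu, p1 = fit_first_pc(multiscale_pixels(train))
threshold = np.percentile(q_image(train, mu, p1), 99.9)  # assumed threshold rule
defect_mask = q_image(test, mu, p1) > threshold
print(int(defect_mask.sum()))
```

Pixel noise is averaged away in the coarser channels while a spatially coherent defect survives the smoothing, which is why the defect stands out in the residual even though it is of the same order as the noise range in the raw image.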
After 30 years of economic development, the high-tech industry has played an important role in China’s national economy. The development of high-level technological industry plays a leading role in guiding the transformation of China’s economy from “investment-driven” to “technology-driven”. The high-tech industry represents the future direction of industrial development and plays a positive role in promoting the transformation of traditional industries. The rapid development of the high-tech industry is the key to social progress. In this paper, the traditional analytical model of statistics is combined with principal component analysis and spatial analysis, and the R language is used to present the analytical results intuitively on a map. Finally, a comprehensive evaluation is established.
The increasing contamination of water resources worldwide and in our country, the decline in water quality over time, and the failure to meet the objectives of water resource utilization have increased the importance of water management. Monitoring water resources and evaluating the monitoring results have guided studies aimed at controlling the factors that pollute water resources and reduce water quality. Nilüfer Creek is very important to the city of Bursa as both a source of drinking and potable water and a discharge area for wastewater. In this study, analysis results for the period 2002-2010, taken at 15 points by the General Directorate of Bursa Water and Sewerage Administration (BUWSA), were evaluated with respect to the water quality of Nilüfer Creek. Non-parametric methods were used to evaluate the water quality data because the data were not normally distributed. Principal Component Analysis was applied to identify the parameters that best represent the water quality. According to the results, the 9 most representative of the 19 water quality parameters were BOD5, COD, TSS, T.Fe, Zn, conductivity, NO2-N, Ni, and NO3-N, which make up the first two components.
Principal component analysis (PCA) is employed for the first time to extract the principal components (PCs) present in nuclear mass models. The effects from different nuclear mass models are reintegrated and reorganized in the extracted PCs. These PCs are recombined to build new mass models, which achieve better accuracy than the original theoretical mass models. This comparison indicates that, using the PCA approach, the effects contained in different mass models can be combined to improve nuclear mass predictions.
To reduce variations in product quality in batch processes, multivariate statistical process control methods based on multi-way principal component analysis (MPCA) or multi-way projection to latent structure (MPLS) were proposed for on-line batch process monitoring. However, they are based on the decomposition of the relative covariance matrix and are strongly affected by outlying observations. In this paper, based on an efficient projection pursuit algorithm, a robust statistical batch process monitoring (RSBPM) framework that is resistant to outliers is proposed to reduce the high demand for modeling data. The construction of the robust normal operating condition model and robust control limits is discussed in detail. The framework is evaluated by monitoring an industrial streptomycin fermentation process and compared with conventional MPCA. The results show that the RSBPM framework is resistant to possible outliers, and its robustness is confirmed.
The purpose of this research was to develop a new approach for determining the overhaul and maintenance cost of loading equipment in surface mining. Two statistical models, univariate exponential regression (UER) and multivariate linear regression (MLR), were used in this study. Loading equipment parameters such as bucket capacity, machine weight, engine power, boom length, digging depth, and dumping height were considered as variables. The results obtained by the models and the mean absolute error rate indicate that these models can be applied as a useful tool for determining the overhaul and maintenance cost of loading equipment. The results of this study can be used by decision-makers for specific surface mining operations.
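The two model families named here can be fitted with ordinary least squares, as the sketch below shows. It is only an illustration under stated assumptions: the exponential form y = a·exp(b·x), the log-linear fitting route, the definition of the mean absolute error rate as the mean of |error| / |actual|, and the made-up capacity and cost values are not taken from the study.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on log(y); requires y > 0."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return float(np.exp(log_a)), float(b)

def fit_linear(X, y):
    """Multivariate linear regression y = c0 + X @ c, solved by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mean_abs_error_rate(y_true, y_pred):
    """Mean of |error| / |actual| (assumed definition of the error rate)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

# Made-up illustration data, not values from the study.
capacity = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # e.g. bucket capacity
cost = 3.0 * np.exp(0.25 * capacity)               # noise-free exponential cost
a, b = fit_exponential(capacity, cost)
rate = mean_abs_error_rate(cost, a * np.exp(b * capacity))
print(round(a, 3), round(b, 3))
```

Fitting on log(y) minimizes relative rather than absolute error, which pairs naturally with a relative error measure; a nonlinear least squares fit on the original scale would weight the expensive machines more heavily.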
Funding (PCA-DNN alloying element yield model for LF refining): supported by the National Natural Science Foundation of China (No. 51974023) and the State Key Laboratory of Advanced Metallurgy, University of Science and Technology Beijing (No. 41621005).
Funding (PCA and ICA with KDE for process monitoring): supported by the National Natural Science Foundation of China (No. 60574047) and the Doctorate Foundation of the State Education Ministry of China (No. 20050335018).
Funding (PCA-based multivariate grey incidence models): supported by the National Natural Science Foundation of China (71401052), the Key Project of National Social Science Fund of China (12AZD108), the Doctoral Fund of Ministry of Education (20120094120024), the Philosophy and Social Science Fund of Jiangsu Province Universities (2013SJD630073), and the Central University Basic Service Project Fee of Hohai University (2011B09914).
Funding (PCA-based multivariate process capability indices): supported by the National Natural Science Foundation of China (Grant Nos. 70802043, 71225006, and 71002105).
基金Supported by the 973 project of China (2013CB733600), the National Natural Science Foundation (21176073), the Doctoral Fund of Ministry of Education (20090074110005), the New Century Excellent Talents in University (NCET-09-0346), "Shu Guang" project (09SG29) and the Fundamental Research Funds for the Central Universities.
Abstract: The kernel principal component analysis (KPCA) method employs the first several kernel principal components (KPCs), which carry most of the variance information of normal observations, for process monitoring, but these components may not reflect the fault information. In this study, sensitive kernel principal component analysis (SKPCA) is proposed to improve process monitoring performance, i.e., to deal with the discordance between the T² statistic and the squared prediction error (SPE) statistic and to reduce missed detection rates. The T² statistic can be used to measure the variation directly along each KPC, to analyze the detection performance, and to capture the most useful information in a process. By calculating the change rate of the T² statistic along each KPC, SKPCA selects the sensitive kernel principal components for process monitoring. A simulated simple system and the Tennessee Eastman process are employed to demonstrate the efficiency of SKPCA in online monitoring. The results indicate that the monitoring performance is improved significantly.
Abstract: Principal component analysis (PCA) and cluster analysis are used to standardize, reduce the dimension of, and de-correlate a multi-index evaluation system for fruit and vegetable nutrition. Principal component factors are assigned based on cluster analysis of the loading matrix, combined with the actual meaning and evaluation direction of the index categories. The nutritional richness of each fruit and vegetable is evaluated from its nutrition score, and equivalent replacement suggestions for different seasons are then given according to the clustering results. The study shows that the principal component cluster method can not only classify multivariate data reasonably and effectively, but also evaluate the sample objects reasonably, providing a sound basis for evaluating the nutrition of fruits and vegetables.
Funding: Project (2003AA430200) supported by the National High-Tech Research and Development Program of China
Abstract: To improve the reliability of the excavator's hydraulic system, a fault detection approach based on dynamic principal component analysis (PCA) was proposed. Dynamic PCA is an extension of PCA that can effectively extract the dynamic relations among process variables. In the first step, normal samples were used as training data to develop a dynamic PCA model. Secondly, the dynamic PCA model decomposed the testing data into projections onto the principal component subspace (PCS) and the residual subspace (RS). Thirdly, the T² statistic and the Q statistic served as fault detection indices in the PCS and RS, respectively. Several simulated faults were introduced to validate the approach. The results show that the dynamic PCA model is able to detect overall faults by using the T² and Q statistics. In simulation analysis, the proposed approach achieves an accuracy of 95% for 20 test sample sets, which shows that the fault detection approach can be effectively applied to the excavator's hydraulic system.
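The three steps described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the synthetic four-sensor data, the single time lag, the choice of four retained components, the empirical 99th-percentile control limit, and the helper names `lagged_matrix`, `fit_dpca` and `monitor`.

```python
import numpy as np

def lagged_matrix(X, lags):
    """Augment each sample with its `lags` previous samples (dynamic PCA)."""
    n = X.shape[0]
    # Row t holds [x(t), x(t-1), ..., x(t-lags)]
    return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])

def fit_dpca(X_train, lags=1, n_components=2):
    """Step 1: build a dynamic PCA model from normal operating data."""
    Z = lagged_matrix(X_train, lags)
    mean, std = Z.mean(axis=0), Z.std(axis=0, ddof=1)
    cov = np.cov((Z - mean) / std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    return {"lags": lags, "mean": mean, "std": std,
            "P": eigvecs[:, order[:n_components]],
            "lam": eigvals[order[:n_components]]}

def monitor(model, X):
    """Steps 2 and 3: project onto PCS / RS and compute T2 and Q."""
    Z = (lagged_matrix(X, model["lags"]) - model["mean"]) / model["std"]
    scores = Z @ model["P"]                    # projection onto the PCS
    t2 = np.sum(scores**2 / model["lam"], axis=1)
    residual = Z - scores @ model["P"].T       # projection onto the RS
    q = np.sum(residual**2, axis=1)
    return t2, q

# Synthetic process (hypothetical): two autocorrelated latent factors, four sensors.
rng = np.random.default_rng(0)
n = 800
S = np.zeros((n, 2))
for t in range(1, n):
    S[t] = 0.8 * S[t - 1] + rng.normal(size=2)
A = np.array([[1.0, 0.5, -0.3, 0.8], [0.2, -1.0, 0.7, 0.4]])
X = S @ A + 0.1 * rng.normal(size=(n, 4))

X_train, X_test = X[:500], X[500:].copy()
X_test[150:, 0] += 3.0                         # simulated sensor bias fault

model = fit_dpca(X_train, lags=1, n_components=4)
_, q_train = monitor(model, X_train)
q_limit = np.percentile(q_train, 99)           # assumed empirical control limit

t2, q = monitor(model, X_test)
normal_rate = np.mean(q[:149] > q_limit)       # alarms before the fault
fault_rate = np.mean(q[150:] > q_limit)        # alarms after the fault
print(f"false alarm rate: {normal_rate:.3f}, detection rate: {fault_rate:.3f}")
```

The lagged matrix is what makes the model dynamic: a sensor bias breaks the learned relation between current and previous samples, so it tends to appear in the Q statistic even when T² stays within its limit.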
Abstract: To assess the quality of groundwater resources, samples were collected from 22 points for the mean annual water years of 2003 and 2015 (mean minimum and maximum water table), and 19 parameters were examined and calculated. One objective of this study was to evaluate the groundwater quality of the Ghaemshahr plain, including the spatial and temporal changes of groundwater quality in different sectors and the factors affecting it. Statistical methods such as the Pearson correlation coefficient, factor analysis, principal component analysis, and combined diagrams were used together with hydrochemical methods to assess the chemical quality of the groundwater. Samples were grouped by cluster analysis, and similar samples were identified. Accordingly, the samples were classified into four categories that represent the quality of groundwater in different districts. Factor analysis was used to identify the factors affecting the geochemical processes of the aquifer. The statistical methods proved able to complement conventional hydrogeochemical methods and to yield very precise results. Based on the results obtained, the Ghaemshahr groundwater was supersaturated according to its saturation index, and the groundwater quality of the Ghaemshahr plain is controlled by processes such as dissolution of halide (saltwater intrusion from the Caspian Sea and brackish fossil aquifers), dissolution of calcite and dolomite (limestone, dolomite, and marl in the highlands), weathering of sodium-rich plagioclases (clay minerals), and ion exchange.
Funding: Supported by the National High-Tech Development Program of China (No. 863-511-920-011, 2001AA411230).
Abstract: Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure of performing MSPM&C for chemical processes, namely modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the source cause of the abnormal behavior, is analyzed. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control for nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are given.
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 2001AA413320)
Abstract: The principal component analysis (PCA) algorithm is widely applied in a diverse range of fields for performance assessment, fault detection, and diagnosis. However, in the presence of noise and gross errors, nonlinear PCA (NLPCA) using autoassociative bottleneck neural networks is so sensitive that the obtained model differs significantly from the underlying system. In this paper, a robust version of NLPCA is introduced by replacing the generally used error criterion, mean squared error, with a mean log squared error. This is followed by a concise analysis of the corresponding training method. A novel multivariate statistical process monitoring (MSPM) scheme incorporating the proposed robust NLPCA technique is then investigated, and its efficiency is assessed through application to an industrial fluidized catalytic cracking plant. The results demonstrate that, compared with NLPCA, the proposed approach can effectively reduce the number of false alarms and is, hence, expected to better monitor real-world processes.
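To illustrate why swapping the error criterion can make training less sensitive to gross errors, the sketch below compares the two criteria on a set of residuals containing one outlier. The abstract does not state the exact formula, so the form log(1 + e²/2) used in `mlse` is an assumption, and the residual values are made up for illustration.

```python
import numpy as np

def mse(residuals):
    """Mean squared error: each residual contributes e^2."""
    return float(np.mean(residuals**2))

def mlse(residuals):
    """Mean log squared error, ASSUMED form: mean of log(1 + e^2 / 2)."""
    return float(np.mean(np.log1p(residuals**2 / 2.0)))

# Hypothetical reconstruction residuals: 100 small errors, then one gross error.
clean = np.full(100, 0.1)
contaminated = clean.copy()
contaminated[0] = 50.0

# How much does a single gross error inflate each criterion?
ratio_mse = mse(contaminated) / mse(clean)
ratio_mlse = mlse(contaminated) / mlse(clean)
print(f"MSE inflated by  {ratio_mse:.1f}x")
print(f"MLSE inflated by {ratio_mlse:.1f}x")
```

Because the logarithm grows slowly for large residuals, a gross error contributes a bounded-looking penalty and a shrinking gradient, so the fitted network is pulled far less toward the outlier than under the squared-error criterion.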
Abstract: Pancreatic cancer (PC) is one of the most aggressive and lethal neoplastic diseases. A valid alternative to the usual invasive diagnostic tools would certainly be the determination of biomarkers in peripheral fluids to provide less invasive tools for early diagnosis. Nowadays, biomarkers are generally investigated mainly in peripheral blood and tissues through high-throughput omics techniques comparing control vs pathological samples. The results can be evaluated by two main strategies: (1) classical methods in which the identification of significant biomarkers is accomplished by monovariate statistical tests where each biomarker is considered as independent from the others; and (2) multivariate methods, taking into consideration the correlations existing among the biomarkers themselves. This last approach is very powerful since it allows the identification of pools of biomarkers with diagnostic and prognostic performances which are superior to single markers in terms of sensitivity, specificity and robustness. Multivariate techniques are usually applied with variable selection procedures to provide a restricted set of biomarkers with the best predictive ability; however, standard selection methods are usually aimed at the identification of the smallest set of variables with the best predictive ability and exhaustivity is usually neglected. The exhaustive search for biomarkers is instead an important alternative to standard variable selection since it can provide information about the etiology of the pathology by producing a comprehensive set of markers. In this review, the most recent applications of the omics techniques (proteomics, genomics and metabolomics) to the identification of exploratory biomarkers for PC will be presented with particular regard to the statistical methods adopted for their identification.
The basic theory related to classical and multivariate methods for identification of biomarkers is presented and then, the most recent applications in this field are discussed.
Funding: Project (2012ZX07501002-001) supported by the Ministry of Science and Technology of China
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA) and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012-2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most of the physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP) and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three and four latent factors explaining 70%, 62% and 71% of the total variance in the water quality data sets of the HP, MP and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA for the whole basin showed the best results for data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrated the utility of multivariate statistical techniques for the analysis and interpretation of data sets and, in water quality assessment, for the identification of pollution sources/factors and the understanding of spatial variations in water quality for effective stream water quality management.
Abstract: Considering the problems that currently need to be solved in synthetic earthquake prediction, a new model is proposed in this paper. It is called the joint multivariate statistical model and combines principal component analysis with discriminatory analysis. Principal component analysis and discriminatory analysis are very important theories in multivariate statistical analysis, which has developed quickly over the last thirty years. By means of the maximization information method, we choose, from numerous earthquake prediction factors, several factors whose cumulative proportion of the total sample variance exceeds 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30°-42°N, 108°-125°E), and better prediction results are obtained.
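The 90% cumulative-variance criterion mentioned in this abstract can be sketched as follows. This is only an illustration of the generic PCA selection rule, not the paper's full model: the synthetic six-factor data set, the two hidden drivers, and the function name `n_components_for_variance` are assumptions introduced here.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.90):
    """Return the smallest number of principal components whose cumulative
    share of the total sample variance reaches `threshold`."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize factors
    eigvals = np.linalg.eigvalsh(np.cov(Xs, rowvar=False))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First index where the cumulative proportion is >= threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return k, cumulative

# Hypothetical data: six observed factors driven by two hidden sources.
rng = np.random.default_rng(1)
sources = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.9, 1.1, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.2, 0.8]])
factors = sources @ mixing + 0.05 * rng.normal(size=(200, 6))

k, cumulative = n_components_for_variance(factors, threshold=0.90)
print("components kept:", k)
print("cumulative proportions:", np.round(cumulative, 3))
```

The retained component scores could then be passed to a regression or a Mahalanobis-distance classifier, which is the role the abstract assigns to the discriminatory analysis stage.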
Funding: Supported in part by the Natural Science Foundation of China (NSFC) (Grant No. 50875240).
Abstract: Inspired by the coarse-to-fine visual perception process of the human vision system, a new approach based on Gaussian multi-scale space was proposed for defect detection in industrial products. By selecting different scale parameters of the Gaussian kernel, a multi-scale representation of the original image data could be obtained and used to constitute a multivariate image, in which each channel represents a perceptual observation of the original image at a different scale. Multivariate Image Analysis (MIA) techniques were used to extract defect feature information. MIA was combined with Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image. The Q-statistic image, derived from the residuals after extraction of the first principal component score and noise, could be used to efficiently reveal surface defects with an appropriate threshold value determined from training images. Experimental results show that the proposed method performs better than the gray histogram-based method. It is less sensitive to inhomogeneous illumination and detects defects more robustly and reliably, with a lower pseudo reject rate.
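The pipeline outlined in this abstract (Gaussian scale space, multivariate image, PCA, Q-statistic image, threshold from training images) can be sketched with NumPy alone. The scale values, the synthetic images with an illumination gradient, the single retained component, and the threshold rule of 1.5 times the largest training Q value are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing with reflect padding."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x**2) / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    smooth = lambda v: np.convolve(v, kernel, mode="valid")
    rows = np.apply_along_axis(smooth, 1, padded)
    return np.apply_along_axis(smooth, 0, rows)

def multiscale_pixels(img, sigmas):
    """Multivariate image: one column per scale, one row per pixel."""
    channels = [img] + [gaussian_blur(img, s) for s in sigmas]
    return np.stack([c.ravel() for c in channels], axis=1)

def fit_pca(pixels, n_components=1):
    mean = pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(pixels - mean, full_matrices=False)
    return mean, vt[:n_components].T

def q_image(img, sigmas, mean, P):
    """Squared residual per pixel after removing the retained components."""
    Z = multiscale_pixels(img, sigmas) - mean
    residual = Z - (Z @ P) @ P.T
    return np.sum(residual**2, axis=1).reshape(img.shape)

# Hypothetical surfaces: uneven illumination plus sensor noise.
rng = np.random.default_rng(2)
illumination = np.tile(np.linspace(0.3, 0.7, 64), (64, 1))
train_img = illumination + 0.02 * rng.normal(size=(64, 64))
test_img = illumination + 0.02 * rng.normal(size=(64, 64))
test_img[30:34, 30:34] += 0.5                 # small bright defect

sigmas = (1.0, 2.0, 4.0)                      # assumed scale parameters
mean, P = fit_pca(multiscale_pixels(train_img, sigmas), n_components=1)
threshold = 1.5 * q_image(train_img, sigmas, mean, P).max()

defect_map = q_image(test_img, sigmas, mean, P) > threshold
defect_rate = defect_map[30:34, 30:34].mean() # flagged share inside the defect
far_rate = defect_map[:20, :].mean()          # flagged share far from it
print(f"inside defect: {defect_rate:.2f}, far from defect: {far_rate:.4f}")
```

In this toy setting the first component absorbs the slow illumination change shared by all scales, which is why the residual image responds to the local defect rather than to the brightness gradient. Pixels bordering the defect may also be flagged, because smoothing spreads the defect into neighbouring positions.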
Abstract: After 30 years of economic development, the high-tech industry has played an important role in China’s national economy. The development of high-level technological industry plays a leading role in guiding the transformation of China’s economy from “investment-driven” to “technology-driven”. The high-tech industry represents the future direction of industrial development and plays a positive role in promoting the transformation of traditional industries. The rapid development of the high-tech industry is the key to social progress. In this paper, the traditional analytical model of statistics is combined with principal component analysis and spatial analysis, and the R language is used to express the analytical results intuitively on a map. Finally, a comprehensive evaluation is established.
Abstract: The increasing contamination of water resources worldwide and in our country, the decline in water quality over time, and the failure to meet water resource utilization objectives have increased the importance of water management. Monitoring water resources and evaluating the monitoring results guide studies aimed at controlling the factors that pollute water resources and reduce water quality. Nilüfer Creek is very important to the city of Bursa both as a source of drinking and potable water and as a discharge area for wastewater. In this study, analysis results for the period 2002-2010, taken from 15 points by the General Directorate of Bursa Water and Sewerage Administration (BUWSA), were evaluated with respect to the water quality of Nilüfer Creek. Non-parametric methods were used to evaluate the water quality data because the data were not normally distributed. Principal Component Analysis was applied to identify the parameters that best represent the water quality. According to the analysis, the 9 most representative of the 19 water quality parameters, all belonging to the first two components, were BOD5, COD, TSS, T.Fe, Zn, conductivity, NO2-N, Ni and NO3-N.
Funding: Supported by the State Key Laboratory of Nuclear Physics and Technology, Peking University (Grant No. NPT2023KFY02), the China Postdoctoral Science Foundation (Grant No. 2021M700256), the National Key R&D Program of China (Grant No. 2018YFA0404400), the National Natural Science Foundation of China (Grant Nos. 11935003, 11975031, 12141501, and 12070131001), and the High-performance Computing Platform of Peking University.
Abstract: Principal component analysis (PCA) is employed for the first time to extract the principal components (PCs) present in nuclear mass models. The effects from different nuclear mass models are reintegrated and reorganized in the extracted PCs. These PCs are recombined to build new mass models, which achieve better accuracy than the original theoretical mass models. This comparison indicates that, with the PCA approach, the effects contained in different mass models can be combined to improve nuclear mass predictions.
Abstract: To reduce variations in product quality in batch processes, multivariate statistical process control methods based on multi-way principal component analysis (MPCA) or multi-way projection to latent structures (MPLS) have been proposed for on-line batch process monitoring. However, they are based on the decomposition of the relative covariance matrix and are strongly affected by outlying observations. In this paper, building on an efficient projection pursuit algorithm, a robust statistical batch process monitoring (RSBPM) framework that is resistant to outliers is proposed to reduce the high demand for modeling data. The construction of the robust normal operating condition model and the robust control limits is discussed in detail. The framework is evaluated by monitoring an industrial streptomycin fermentation process and is compared with conventional MPCA. The results show that the RSBPM framework is resistant to possible outliers, and its robustness is confirmed.
Abstract: The purpose of this research was to develop a new approach for determining the overhaul and maintenance cost of loading equipment in surface mining. Two statistical models, univariate exponential regression (UER) and multivariate linear regression (MLR), were used in this study. Loading equipment parameters such as bucket capacity, machine weight, engine power, boom length, digging depth, and dumping height were considered as variables. The results obtained from the models and the mean absolute error rate indicate that these models can be applied as a useful tool for determining the overhaul and maintenance cost of loading equipment. The results of this study can be used by decision-makers for specific surface mining operations.
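The two model types named in this abstract can be fitted in a few lines, as sketched below. The data are synthetic and noise-free, the coefficients are invented, and the error measure is assumed to be the mean of |actual - predicted| / actual, since the abstract does not define its mean absolute error rate. The function names are hypothetical.

```python
import numpy as np

def fit_exponential(x, y):
    """Univariate exponential regression y = a * exp(b * x),
    fitted by least squares on log(y)."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return float(np.exp(log_a)), float(b)

def fit_linear(X, y):
    """Multivariate linear regression y = c0 + c1*x1 + ... via least squares."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def mean_absolute_error_rate(y_true, y_pred):
    """ASSUMED definition: mean of |y_true - y_pred| / |y_true|."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

# Hypothetical equipment data (not from the paper).
bucket_capacity = np.linspace(1.0, 30.0, 20)            # m^3
cost_uer = 5.0 * np.exp(0.05 * bucket_capacity)         # invented cost curve

a, b = fit_exponential(bucket_capacity, cost_uer)
uer_error = mean_absolute_error_rate(cost_uer, a * np.exp(b * bucket_capacity))

engine_power = np.linspace(50.0, 500.0, 20)             # kW
machine_weight = 10.0 * np.sqrt(engine_power)           # t, invented relation
features = np.column_stack([engine_power, machine_weight])
cost_mlr = 2.0 + 3.0 * engine_power + 0.5 * machine_weight

coef = fit_linear(features, cost_mlr)
mlr_pred = np.column_stack([np.ones(20), features]) @ coef
mlr_error = mean_absolute_error_rate(cost_mlr, mlr_pred)
print(f"UER: a={a:.3f}, b={b:.3f}, error rate={uer_error:.2e}")
print(f"MLR: coef={np.round(coef, 3)}, error rate={mlr_error:.2e}")
```

Fitting the exponential model on log(y) keeps the problem linear, at the price of minimizing relative rather than absolute deviations; a nonlinear least-squares fit would be the alternative if absolute cost errors matter more.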