Funding: Supported by the National Natural Science Foundation of China (No. 20562009), the State Key Laboratory of Food Science and Technology of Nanchang University (No. SKLF-TS-200819 and -MB-200807), the Jiangxi Province Natural Science Foundation (No. JXNSF0620041), and the Program for Changjiang Scholars and Innovative Research Team in Universities (No. IRT0540).
Abstract: Benzoic acid (BA), methylparaben (MP), propylparaben (PP) and sorbic acid (SA) are food preservatives with well-defined UV spectra. However, their spectra overlap severely, and it is difficult to determine them individually in mixtures without prior separation. In this paper, seven different chemometric approaches were applied to resolve the overlapping spectra and to determine these compounds simultaneously. Judged by the criteria of % relative prediction error (RPE) and % recovery, principal component regression (PCR) and radial basis function-artificial neural network (RBF-ANN) were the preferred methods. These two methods were successfully applied to the analysis of some commercial samples.
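As a rough illustration of how PCR can resolve overlapping spectra, the sketch below fits PCR to synthetic four-component mixtures and reports the % RPE. It is not the authors' implementation: the Gaussian band shapes, band centers, noise level, and the function names `pcr_fit`, `pcr_predict` and `rpe` are all assumptions made for the example, and the RPE formula used (100 times the root of the summed squared errors over the summed squared true concentrations) is one common definition that the abstract itself does not spell out.

```python
import numpy as np

def pcr_fit(X, Y, n_components):
    """Fit principal component regression on spectra X and concentrations Y."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # SVD of the centered spectra; rows of vt are the loading vectors.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = vt[:n_components].T              # (n_wavelengths, k)
    scores = Xc @ loadings                      # (n_samples, k)
    # Regress concentrations on the scores, then map back to wavelength space.
    q = np.linalg.lstsq(scores, Yc, rcond=None)[0]
    return x_mean, y_mean, loadings @ q

def pcr_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def rpe(c_true, c_pred):
    """% relative prediction error (assumed definition, see lead-in)."""
    return 100.0 * np.sqrt(np.sum((c_pred - c_true) ** 2) / np.sum(c_true ** 2))

# Synthetic, hypothetical data: four overlapping Gaussian bands.
rng = np.random.default_rng(0)
wavelengths = np.linspace(200, 300, 101)
centers = [225, 240, 255, 270]                  # hypothetical band centers (nm)
pure = np.stack([np.exp(-0.5 * ((wavelengths - c) / 12.0) ** 2) for c in centers])

C_train = rng.uniform(1, 10, size=(30, 4))
C_test = rng.uniform(1, 10, size=(10, 4))
X_train = C_train @ pure + rng.normal(0, 1e-3, size=(30, 101))
X_test = C_test @ pure + rng.normal(0, 1e-3, size=(10, 101))

model = pcr_fit(X_train, C_train, n_components=4)
C_pred = pcr_predict(model, X_test)
print(f"RPE = {rpe(C_test, C_pred):.3f} %")
```

With real UV data the number of components would normally be chosen by cross-validation rather than fixed at the number of analytes.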
Abstract: In addition to the conventional methods of calibration model construction, such as PCR (principal components regression) and PLS (partial least-squares), an MPM (mathematical programming method) is developed and proposed for practical use in NIR analyses of agricultural and food products. The proposed method uses mathematical programming techniques to find the regression coefficients of the calibration model. It is based on the optimization theory used for finding the extremum of an objective function in a given domain of a vector space and employs methods for solving complementarity problems. The MPM algorithm is described in detail. The MPM was tested on an InfraLUM FT-10 NIR analyzer (Lumex) with samples of dry milk (for fat), corn (for protein) and rye flour (for moisture). The results show that the MPM can be used to construct multivariate calibrations with quality characteristics superior to those of the classical PCR and PLS methods.
Funding: Supported by the Foundation of Tianjin City Science and Technology Project (No. 09ZCKFSH00900).
Abstract: Corn steep liquor (CSL) is an important raw material that has high nutritional value and serves as a nitrogen source. Biotin in CSL is of particular importance to fermentation. In order to develop a fast, versatile, cheap, and environmentally safe analytical method for quantifying vitamins B2 (VB2), B3 (VB3), B6 (VB6) and B7 (VB7) in CSL, near-infrared (NIR) spectroscopy measurements of 66 samples (22 batches) of CSL were analyzed by partial least-squares regression (PLSR). Multivariate models developed in the NIR regions showed good predictive abilities for VB2, VB3, VB6 and VB7. The results confirmed the potential of the multivariate spectroscopic approach as a replacement for expensive and time-consuming conventional chemical methods.
Funding: Supported by the National Science Foundation (NSF) CAREER Award (CBET1151154), the National Aeronautics and Space Administration (NASA) Early Career Faculty Grant (NNX12AQ44G), the Gulf of Mexico Research Initiative (GoMRI-030), the Cullen College of Engineering at the University of Houston, and the MIT Laser Biomedical Research Center, supported by the NIH National Center for Research Resources, Grant No. P41-RR02594.
Abstract: Multivariate calibration is an important tool for spectroscopic measurement of analyte concentrations. We present a detailed study of a hybrid multivariate calibration technique, constrained regularization (CR), and demonstrate its utility in noninvasive glucose sensing using Raman spectroscopy. Similar to partial least squares (PLS) and principal component regression (PCR), CR builds an implicit model and requires knowledge only of the concentrations of the analyte of interest. Calibration is treated as an inverse problem in which an optimal balance between model complexity and noise rejection is achieved. Prior information is included in the form of a spectroscopic constraint that can be obtained conveniently. When used with an appropriate constraint, CR provides a better calibration model than PLS in both numerical and experimental studies.
Abstract: Although near infrared (NIR) spectroscopy has been evaluated for numerous applications, the number of actual on-line or even on-site industrial applications seems to be very limited. The present paper describes attempts to produce online predictions of the chemical oxygen demand (COD) in wastewater from a pulp and paper mill using NIR spectroscopy. The task was perceived as very challenging, but with a root mean square error of prediction of 149 mg/l, roughly corresponding to 1/10 of the studied concentration interval, the attempt was deemed successful. This result was obtained by using partial least squares regression, interpolated reference values for calibration, and an even distribution of the calibration data in the concentration space. This work may also represent the first industrial application of online COD measurements in wastewater using NIR spectroscopy.
Funding: Supported by the National Natural Science Foundation of China and the Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Academia Sinica.
Abstract: A global optimum location algorithm called Variable Step-Size Generalized Simulated Annealing (VSGSA) was applied to data obtained with an array of ion-selective electrodes in solutions containing mixtures of Na+, K+ and Ca2+. Unlike traditional optimization algorithms such as the simplex procedure, VSGSA can determine the model parameters without any a priori information about the analytical system under investigation, and it overcomes the disadvantage of the simplex method, which may converge to local extrema depending on the starting position. The algorithm was applied to the potentiometric determination of ions in mixture solutions.
Funding: Supported by the Fundamental Research Funds for the Central Universities (3101000-841413030), the National Oceanic and Atmospheric Administration through grant NA11NOS0120033, the National Natural Science Foundation of China through grants 41506012, 41376001, 41206013, 41476047, 41430963 and 41206004, the National Aeronautics and Space Administration through grant NNX13AD80G, and the Public Science and Technology Research Funds Projects of Ocean (201205018).
Abstract: A multivariate statistical downscaling method is developed to produce regional, high-resolution, coastal surface wind fields based on the IPCC global model predictions for the U.S. east coastal ocean, the Gulf of Mexico (GOM), and the Caribbean Sea. The statistical relationship is built upon linear regressions between the empirical orthogonal function (EOF) spaces of a cross-calibrated, multi-platform, multi-instrument ocean surface wind velocity dataset (predictand) and the global NCEP wind reanalysis (predictor) over a 10-year period from 2000 to 2009. The statistical relationship is validated before application, and its effectiveness is confirmed by the good agreement between downscaled wind fields based on the NCEP reanalysis and in-situ surface wind measured at 16 National Data Buoy Center (NDBC) buoys in the U.S. east coastal ocean and the GOM during 1992–1999. The predictand-predictor relationship is applied to IPCC GFDL model output (2.0°×2.5°) to produce downscaled coastal wind at 0.25°×0.25° resolution. The temporal and spatial variability of future predicted wind speeds and wind energy potential over the study region are further quantified. It is shown that wind speed and power would be significantly reduced in the high CO2 climate scenario offshore of the mid-Atlantic and northeast U.S., with the speed falling to one quarter of its original value.
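The regression between EOF spaces described in this abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the function names (`eof_decompose`, `fit_downscaling`, `downscale`), the grid sizes, the number of retained modes and the noise level are all hypothetical, and real wind fields would need both velocity components and area weighting, which are omitted here.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """Return the time mean, leading EOFs and principal components.

    field has shape (n_times, n_grid_points).
    """
    mean = field.mean(axis=0)
    anomalies = field - mean
    # Rows of vt are the spatial patterns (EOFs), ordered by explained variance.
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]
    pcs = anomalies @ eofs.T
    return mean, eofs, pcs

def fit_downscaling(coarse, fine, n_modes):
    """Linear regression from predictor PCs (coarse) to predictand PCs (fine)."""
    c_mean, c_eofs, c_pcs = eof_decompose(coarse, n_modes)
    f_mean, f_eofs, f_pcs = eof_decompose(fine, n_modes)
    coef = np.linalg.lstsq(c_pcs, f_pcs, rcond=None)[0]
    return {"c_mean": c_mean, "c_eofs": c_eofs,
            "f_mean": f_mean, "f_eofs": f_eofs, "coef": coef}

def downscale(model, coarse_new):
    """Project new coarse fields onto the predictor EOFs and map to the fine grid."""
    pcs = (coarse_new - model["c_mean"]) @ model["c_eofs"].T
    return model["f_mean"] + pcs @ model["coef"] @ model["f_eofs"]

# Synthetic example: three shared latent modes seen on a coarse and a fine grid.
rng = np.random.default_rng(1)
latent = rng.normal(size=(120, 3))
coarse = latent @ rng.normal(size=(3, 20)) + rng.normal(0, 0.01, size=(120, 20))
fine = latent @ rng.normal(size=(3, 200)) + rng.normal(0, 0.01, size=(120, 200))

model = fit_downscaling(coarse[:100], fine[:100], n_modes=3)   # training period
fine_pred = downscale(model, coarse[100:])                     # held-out period
rel_err = np.linalg.norm(fine_pred - fine[100:]) / np.linalg.norm(fine[100:])
print(f"relative error on held-out period: {rel_err:.4f}")
```

Holding out a separate period, as in the last lines, mirrors the idea of validating the relationship on years outside the training window before applying it to model output.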
Funding: Supported by the National Natural Science Foundation of China (No. 20835002) and the International Science and Technology Cooperation Program of the Ministry of Science and Technology (MOST) of China (No. 2008DFA32250), as well as the British Columbia Innovation Council and the Natural Sciences and Engineering Research Council of Canada.
Abstract: The application of Raman spectroscopic techniques combined with multivariate chemometric signal processing promises new means for the rapid, non-destructive, multidimensional analysis of metabolites, with little or no sample preparation and little sensitivity to water. However, Rayleigh scattering, fluorescence and uncontrolled variance present substantial challenges for the accurate quantitative analysis of metabolites at physiological levels in biologically varying samples. Effective strategies include the application of chemometric pretreatments for reducing Raman spectral interference. However, the arbitrary application of individual or combined pretreatment procedures can significantly alter the outcome of a measurement, thereby complicating spectral analysis. This paper evaluates and compares six signal pretreatment methods for correcting baseline variance, together with three variable selection methods for eliminating uninformative variables, all within the context of multivariate calibration models based on partial least squares (PLS) regression. Raman spectra of 90 artificial bio-fluid samples with eight urine metabolites at near-physiological concentrations were used to test these models. The combination of multiplicative scatter correction (MSC), continuous wavelet transform (CWT), randomization test (RT) and PLS modeling gave the best performance for all the metabolites. The correlation coefficient (R) between predicted and prepared concentrations reached as high as 0.96.
Funding: Supported by the National Natural Science Foundation of China (No. 20835002).
Abstract: Consensus methods are promising tools for improving the reliability of quantitative models in near-infrared (NIR) spectroscopic analysis. A strategy for improving the performance of consensus methods in multivariate calibration of NIR spectra is proposed. In this approach, a subset of non-collinear variables is generated using the successive projections algorithm (SPA) for each variable in the spectra reduced by uninformative variable elimination (UVE). Sub-models are then built using the variable subsets and the calibration subsets determined by Monte Carlo (MC) re-sampling, and the sub-model that produces the minimal cross-validation error is selected as a member model. By repeating the MC re-sampling, a series of member models is built, and a consensus model is obtained by averaging all the member models. Since member models are built with the best variable subset and a randomly selected calibration subset, both the quality and the diversity of the member models are ensured for the consensus model. Two NIR spectral datasets of tobacco lamina are used to investigate the proposed method. The superiority of the method in both accuracy and reliability is demonstrated.
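The resample, select and average loop in this abstract can be sketched as below. Several simplifications are assumptions of the example rather than the authors' method: the candidate variable subsets are supplied by hand instead of being generated by SPA on UVE-reduced spectra, each sub-model is an ordinary least-squares fit, and the names `build_consensus` and `consensus_predict`, the 70 % calibration fraction and the synthetic data are all hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(design, y, rcond=None)[0]

def ols_predict(coef, X):
    return coef[0] + X @ coef[1:]

def cv_rmse(X, y, n_folds=5):
    """Root mean squared error of k-fold cross validation."""
    idx = np.arange(len(y))
    sq_err = 0.0
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        coef = ols_fit(X[train], y[train])
        sq_err += np.sum((ols_predict(coef, X[fold]) - y[fold]) ** 2)
    return np.sqrt(sq_err / len(y))

def build_consensus(X, y, candidate_subsets, n_members=20, frac=0.7, seed=0):
    """For each Monte Carlo calibration subset, keep the best-CV sub-model."""
    rng = np.random.default_rng(seed)
    n_cal = int(frac * len(y))
    members = []
    for _ in range(n_members):
        rows = rng.choice(len(y), size=n_cal, replace=False)
        best = min(candidate_subsets,
                   key=lambda cols: cv_rmse(X[rows][:, cols], y[rows]))
        members.append((best, ols_fit(X[rows][:, best], y[rows])))
    return members

def consensus_predict(members, X):
    """Average the predictions of all member models."""
    return np.mean([ols_predict(coef, X[:, cols]) for cols, coef in members], axis=0)

# Synthetic data: only variables 1, 4 and 7 carry information about y.
rng = np.random.default_rng(42)
X = rng.normal(size=(80, 10))
y = 2.0 * X[:, 1] - X[:, 4] + 0.5 * X[:, 7] + rng.normal(0, 0.05, size=80)
subsets = [[0, 2, 3], [1, 4, 7], [5, 6, 8, 9]]   # hypothetical candidates

members = build_consensus(X[:60], y[:60], subsets)
y_pred = consensus_predict(members, X[60:])
rmse = np.sqrt(np.mean((y_pred - y[60:]) ** 2))
print(f"consensus RMSE on held-out samples: {rmse:.3f}")
```

Averaging over members trained on different calibration subsets is what supplies the diversity the abstract refers to, while the cross-validation step supplies the quality filter.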
Funding: Supported by the National Natural Science Foundation of China (No. 21175074).
Abstract: Chlorinated paraffins (CPs) are potential persistent organic pollutants (POPs) that threaten the safety of the environment and of organisms. However, the analysis of CPs is difficult because of their complex composition, which includes thousands of congeners. In the present work, the quantitative structure-retention relationship (QSRR) of CPs was studied. A total of 470 molecular descriptors were generated to describe the structures of 28 CPs, and 12 descriptors relevant to the retention time of the CPs were selected by stepwise regression. QSRR models between the retention time and the selected descriptors were then established by multiple linear regression (MLR), partial least squares (PLS) and least squares support vector regression (LS-SVR). The results show that the PLS model is better than MLR and LS-SVR, with a squared correlation coefficient (r²) of 0.9996 and a root mean squared error (RMSE) of 0.015. The PLS model was then used to predict the retention times of 49 C10-CPs. Three of them were investigated by gas chromatography coupled with mass spectrometry (GC-MS). A well-defined correlation was found between the measured retention times and the predicted values.