Funding: Supported by the 948 Program of the State Forestry Administration (2009-4-43) and the National Natural Science Foundation of China (No. 30870420).
Abstract: Boreal forests play an important role in the global environmental system. Understanding the structure and function of boreal forest ecosystems requires accurate monitoring and estimation of forest canopy and biomass. We used partial least squares regression (PLSR) models to relate two forest parameters, canopy closure density and above-ground tree biomass, to Landsat ETM+ data. The models were optimized using the variable importance for projection (VIP) criterion and the bootstrap method, and their performance was compared using several statistical indices. All variables selected by the VIP criterion passed the bootstrap test (p < 0.05). The simplified models, which excluded insignificant variables (VIP < 1), performed as well as the full model but required less computation time. The relative root mean square error (RMSE%) was 29% for canopy closure density and 58% for above-ground tree biomass. We conclude that PLSR can be an effective method for estimating canopy closure density and above-ground biomass.
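The VIP-based variable screening described in this abstract can be illustrated with a minimal sketch. It assumes a single-response PLS fitted by the NIPALS algorithm and the commonly used VIP formula; the data values, the function name `pls1_vip`, and the choice of two components are hypothetical and are not taken from the paper. In practice the predictors would usually be scaled first.

```python
import math


def pls1_vip(X, y, n_components):
    """Fit single-response PLS by NIPALS and return one VIP score per column."""
    n, p = len(X), len(X[0])
    # Center each predictor column and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance with the response.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        # Scores, then loadings for X and the response.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # response variation explained by this component

    total = sum(explained)
    return [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Hypothetical data: 6 plots, 3 predictors, one response.
X = [[1.0, 0.2, 5.1], [2.0, 0.1, 4.9], [3.0, 0.4, 5.0],
     [4.0, 0.3, 5.2], [5.0, 0.5, 4.8], [6.0, 0.2, 5.0]]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]

vip = pls1_vip(X, y, n_components=2)
# Keep only predictors meeting the VIP >= 1 rule mentioned in the abstract.
selected = [j for j, v in enumerate(vip) if v >= 1.0]
print(vip, selected)
```

Because every weight vector has unit length, the squared VIP scores average to 1, which is why 1 is the usual cut-off between important and unimportant variables.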
Funding: Supported by the National Natural Science Foundation of China (Nos. 41874001 and 41664001), the Support Program for Outstanding Youth Talents in Jiangxi Province (No. 20162BCB23050), and the National Key Research and Development Program (No. 2016YFB0501405).
Abstract: When the total least squares (TLS) solution is used to estimate the parameters of the errors-in-variables (EIV) model, the estimates become unreliable if the observations contain systematic errors. To address this problem, we add a nonparametric part (the systematic errors) to the partial EIV model and build the resulting partial EIV model to weaken the influence of systematic errors. After rewriting the model as a nonlinear model, we derive the parameter estimation formula based on the penalized total least squares criterion. Furthermore, using the second-order approximation method of precision estimation, we derive the second-order bias and covariance of the parameter estimates and calculate the mean square error (MSE). To select the smoothing factor, we propose using the U-curve method. The experiments show that, compared with the traditional method, the proposed method mitigates the influence of systematic errors to a certain extent and yields more reliable parameter estimates and precision information, which validates its feasibility and effectiveness.
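For readers unfamiliar with TLS, the simplest case can be sketched in a few lines: fitting a straight line when both coordinates carry errors of equal variance. This is only the classical closed-form TLS (orthogonal regression) line fit, compared with ordinary least squares; it is not the penalized partial EIV estimator proposed in the paper, and the function name and data are hypothetical.

```python
import math


def line_fit_ols_and_tls(xs, ys):
    """Return ((intercept, slope) by OLS, (intercept, slope) by TLS)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the TLS slope is undefined here")
    # OLS attributes all error to y.
    b_ols = sxy / sxx
    # TLS minimizes perpendicular distances (equal error variance in x and y).
    b_tls = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return (my - b_ols * mx, b_ols), (my - b_tls * mx, b_tls)


# Noise-free points on y = 2x + 1: both estimators recover the line.
exact_ols, exact_tls = line_fit_ols_and_tls([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# Hypothetical noisy points: the TLS slope is never flatter than the OLS slope.
noisy_ols, noisy_tls = line_fit_ols_and_tls(
    [0.1, 0.9, 2.2, 2.9, 4.1], [1.2, 2.8, 5.1, 7.2, 8.9]
)
print(exact_tls, noisy_ols, noisy_tls)
```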
Funding: Supported by the "863" Program of P. R. China (2002AA2Z4291).
Abstract: Scientific forecasting of mine water yield is of great significance to mine production safety and to the comprehensive utilization of water resources. This paper establishes a forecasting model for mine water yield that combines a neural network with the partial least squares method. Processing the independent variables with the partial least squares method both resolves the correlation among them and reduces the input dimension of the neural network model; the neural network, which handles nonlinear problems better, is then applied. The result of an example shows that the model achieves higher precision in both forecasting and fitting.
Abstract: Laser-induced breakdown spectroscopy (LIBS) is a fast, non-contact analytical technique that requires no sample preparation, which makes it well suited to on-line analysis of alloy composition. In the copper smelting industry, analysis and control of alloy concentrations greatly affect product quality, so LIBS is an efficient quantitative analysis technology for this industry. In lead brass, however, the contents of Pb, Al and Ni are very low, and their atomic emission lines are easily submerged beneath the complex characteristic spectral lines of copper because of matrix effects. It is therefore difficult to obtain on-line quantitative results for these important elements. In this paper, both the partial least squares (PLS) method and the calibration curve (CC) method are used to quantitatively analyze LIBS data obtained from standard lead brass alloy samples. Both major and trace elements were analyzed. Comparing the results of the two calibration methods shows that, for both major and trace elements, the PLS method performed better than the CC method in quantitative analysis. The regression coefficients of the PLS method are also compared with the original spectral data, including background interference, to explain the advantage of the PLS method in LIBS quantitative analysis. The results show that the PLS method applied to LIBS is suitable for the simultaneous quantitative analysis of elements with different contents in the copper smelting industry.
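The calibration curve (CC) method used as the baseline in this abstract can be sketched as a univariate fit of one emission-line intensity against known concentrations, followed by inversion for an unknown sample. The sketch below assumes a linear calibration; the concentrations, intensities, and function names are hypothetical and do not come from the paper.

```python
def fit_calibration_curve(concentrations, intensities):
    """Least-squares fit of intensity = intercept + slope * concentration."""
    n = len(concentrations)
    mc = sum(concentrations) / n
    mi = sum(intensities) / n
    scc = sum((c - mc) ** 2 for c in concentrations)
    sci = sum((c - mc) * (i - mi) for c, i in zip(concentrations, intensities))
    slope = sci / scc
    return mi - slope * mc, slope


def predict_concentration(intensity, intercept, slope):
    """Invert the calibration curve for a measured line intensity."""
    return (intensity - intercept) / slope


# Hypothetical standards: element content (wt%) and peak intensity (counts).
standard_conc = [0.5, 1.0, 2.0, 3.0]
standard_intensity = [120.0, 220.0, 420.0, 620.0]

intercept, slope = fit_calibration_curve(standard_conc, standard_intensity)
unknown_conc = predict_concentration(320.0, intercept, slope)
print(intercept, slope, unknown_conc)
```

A curve of this kind uses a single line per element, whereas PLS draws on many spectral channels at once, which is one plausible reason for the difference the abstract reports when weak lines are masked by the copper matrix.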
Funding: Supported by the Hi-Tech Research and Development Program (863) of China (No. 2006AA10Z203) and the National Science and Technology Task Force Project (No. 2006BAD10A01), China.
Abstract: Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, hyperspectral leaf reflectance of rice (Oryza sativa L.) was measured over the wavelength range 350 to 2500 nm on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan). The percentage of leaf surface covered by lesions was estimated and defined as the disease severity. Multiple stepwise regression, principal component analysis and partial least squares regression were used to estimate the disease severity of rice brown spot at the leaf level. Multiple stepwise linear regression could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component explained approximately 80% of the variance of the original hyperspectral reflectance. The regression model built on the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least squares regression with seven extracted factors predicted disease severity most effectively among the methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot from hyperspectral reflectance data at the leaf level.
Funding: Financially supported by the Chang'E program of China (No. TY3Q20110029), the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KGCX2-EW-402), and the National Natural Science Foundation of China (Nos. 11003012 and U1231103).
Abstract: In 2013, the Chang'E-3 program will carry out in-situ detection of lunar mineral resources. A Visible and Near-infrared Imaging Spectrometer (VNIS) has been selected as one payload of the CE-3 lunar rover to achieve this goal. It is critical and urgent to evaluate the quality of VNIS spectral data and to validate quantification methods for mineral composition before launch. A ground validation experiment for VNIS was carried out to meet these two goals by simulating, in the laboratory, the detection environment of the CE-3 rover on the lunar surface. Based on the hyperspectral reflectance data obtained, a Correlation Analysis and Partial Least Squares (CA-PLS) algorithm was applied to predict the abundance of four typical lunar minerals (pyroxene, plagioclase, ilmenite and olivine) in their mixtures. We first used correlation analysis (CA) to select a set of VNIS spectral parameters highly correlated with mineral abundance, and then used stepwise regression to identify the spectral parameters contributing most to the mineral contents. Finally, functions linking mineral abundance to the spectral parameters were derived with the partial least squares (PLS) algorithm. Without considering the effects of maturity, agglutinates and Fe0, we found strong correlations between the four minerals and the VNIS spectral parameters. For example, pyroxene abundance correlates positively with the absorption depth of the mixture: absorption depth increases as pyroxene abundance increases. Plagioclase abundance, in contrast, correlates negatively with the band ratio: the band ratio decreases as plagioclase abundance increases. Similarly, the abundances of ilmenite and olivine correlate negatively with the reflectance of the mixture: as ilmenite or olivine abundance increases, the reflectance of the mixture decreases. In model validation, better estimates were obtained for the abundances of pyroxene, plagioclase and ilmenite. It is concluded that VNIS is capable of identifying lunar minerals and that the CA-PLS algorithm has the potential to be used for in-situ prediction of mineral abundance on the lunar surface.
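The correlation analysis (CA) screening step can be sketched as follows: compute the Pearson correlation between each candidate spectral parameter and the mineral abundance, then keep the parameters whose absolute correlation exceeds a threshold. The parameter names, the data, and the 0.8 threshold are hypothetical choices for illustration; the abstract does not state which values the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def screen_parameters(parameters, abundance, threshold=0.8):
    """Keep parameters whose |r| with abundance is at least the threshold."""
    kept = {}
    for name, values in parameters.items():
        r = pearson_r(values, abundance)
        if abs(r) >= threshold:
            kept[name] = r
    return kept


# Hypothetical mixtures: mineral abundance (%) and three spectral parameters.
abundance = [10.0, 20.0, 30.0, 40.0, 50.0]
parameters = {
    "absorption_depth": [0.05, 0.10, 0.15, 0.20, 0.25],  # rises with abundance
    "band_ratio": [1.9, 1.7, 1.5, 1.3, 1.1],              # falls with abundance
    "unrelated": [0.3, 0.1, 0.4, 0.1, 0.3],               # no linear trend
}

kept = screen_parameters(parameters, abundance)
print(kept)
```

Parameters that survive this screening would then be passed to the stepwise regression and PLS stages described in the abstract.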
Funding: Project (2010CB732004) supported by the National Basic Research Program of China; Projects (50934006, 41272304) supported by the National Natural Science Foundation of China.
Abstract: The rate of penetration (ROP) of a tunnel boring machine (TBM) in a rock environment is generally a key parameter for the successful completion of a tunneling project. The objective of this work is to compare the accuracy of prediction models based on partial least squares (PLS) regression and support vector machine (SVM) regression for modeling the penetration rate of a TBM. To develop the models, intact rock properties, namely uniaxial compressive strength (UCS), Brazilian tensile strength (BTS) and peak slope index (PSI), together with rock mass properties, namely the distance between planes of weakness (DPW) and the alpha angle (α), are used as the independent (input) variables, and the measured ROP is the dependent (output) variable. Two hundred data sets were collected from the Queens Water Tunnel and the Karaj-Tehran water transfer tunnel TBM projects. The accuracy of the prediction models is measured by the coefficient of determination (R²) and the root mean square error (RMSE) between predicted and observed values under a 10-fold cross-validation scheme. The R² and RMSE of prediction are 0.8183 and 0.1807 for the SVM regression (SVMR) method, and 0.9999 and 0.0011 for the PLS method, respectively. Comparison of these statistics indicates that the PLS regression model outperforms the SVMR model.
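The evaluation protocol in this abstract (10-fold cross-validation scored by R² and RMSE) can be sketched independently of the model being tested. In the sketch below a one-variable least-squares line stands in for the PLS and SVM regressors, the folds are assigned by index without shuffling, and the data are hypothetical; none of these choices come from the paper.

```python
import math


def fit_line(xs, ys):
    """Placeholder model: least-squares line, returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def kfold_predictions(xs, ys, k=10):
    """Predict every sample from a model trained on the other k-1 folds."""
    n = len(xs)
    preds = [None] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for i in test_idx:
            preds[i] = a + b * xs[i]
    return preds


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root mean square error."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical data: a linear trend with a small alternating disturbance.
xs = [float(i) for i in range(20)]
ys = [3.0 + 0.5 * x + 0.1 * (-1) ** i for i, x in enumerate(xs)]

preds = kfold_predictions(xs, ys, k=10)
r2, rmse = r2_and_rmse(ys, preds)
print(r2, rmse)
```

Pooling the held-out predictions before scoring, as done here, is one common convention; averaging per-fold scores is another, and the abstract does not say which was used.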
Funding: Communication University of China Foundation, China (No. CUC18A015-2); Fundamental Research Funds for the Central Universities, China (No. CUC200D036).
Abstract: The box office in the later part of the Spring Festival period shows attractive prospects. This paper studies the factors affecting total box office during the broad Spring Festival period, which runs from the Spring Festival to the Lantern Festival. Data on films released in China during the broad Spring Festival period from 2016 to 2019 were gathered, and the impact of eight explanatory variables on the box office was empirically analyzed by partial least squares (PLS) regression using the software SIMCA. The results suggest that word-of-mouth has the strongest positive effect on the box office during this period. Late-stage publicity has a positive effect, while early promotion has a negative effect. The director's influence has a positive effect, while the actors' influence contributes little. Trailer length has a negative effect. Whether a film is released in 2D or 3D contributes little to the box office.
Funding: This work was supported by the National High-tech Research and Development Program of China (No. 2003AA412110).
Abstract: Boosting algorithms are a class of general methods used to improve the generalization performance of regression analysis. The main idea is to maintain a distribution over the training set. So that this distribution can be used directly, a modified PLS algorithm is proposed and used as the base learner for nonlinear multivariate regression problems. Experiments on gasoline octane number prediction demonstrate that boosting the modified PLS algorithm gives better generalization performance than the PLS algorithm alone.
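The idea of maintaining a distribution over the training set can be made concrete with one round of a weight update. The abstract does not say which boosting scheme the authors used, so the sketch below assumes the widely known AdaBoost.R2 update with a linear loss; the function name and the residuals are hypothetical.

```python
def adaboost_r2_update(weights, abs_errors):
    """One AdaBoost.R2 round: shift weight toward poorly predicted samples."""
    max_err = max(abs_errors)
    if max_err == 0:
        return list(weights)  # perfect fit, nothing to re-weight
    # Linear loss scaled into [0, 1].
    losses = [e / max_err for e in abs_errors]
    avg_loss = sum(w * l for w, l in zip(weights, losses))
    if avg_loss >= 0.5:
        raise ValueError("base learner is too weak for this update")
    beta = avg_loss / (1.0 - avg_loss)
    # Well-predicted samples (small loss) are shrunk the most.
    updated = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated]


# Hypothetical absolute residuals from a base learner on four samples.
weights = [0.25, 0.25, 0.25, 0.25]
abs_errors = [0.1, 0.2, 0.1, 2.0]

new_weights = adaboost_r2_update(weights, abs_errors)
print(new_weights)
```

A base learner that accepts these weights directly, which is what the modified PLS in the abstract is designed to do, avoids having to resample the training set according to the distribution.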
Abstract: Accurately approximating higher order derivatives is an inherently difficult problem. It is shown that a random variable shape parameter strategy can improve the accuracy of approximating higher order derivatives with Radial Basis Function methods. The method is used to solve fourth order boundary value problems. The use and location of ghost points are examined in order to enforce the extra boundary conditions that are necessary to make a fourth-order problem well posed. The use of ghost points versus solving an overdetermined linear system via least squares is studied. For a general fourth-order boundary value problem, the recommended approach is to either use one of two novel sets of ghost centers introduced here or else to use a least squares approach. When using either ghost centers or least squares, the random variable shape parameter strategy results in significantly better accuracy than when a constant shape parameter is used.
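A random variable shape parameter strategy assigns each RBF center its own shape parameter drawn from an interval, instead of one constant for all centers. The sketch below assumes a uniform draw between `eps_min` and `eps_max` and a multiquadric basis; the interval, the basis, the seed, and the function names are assumptions for illustration, since the abstract gives none of these details.

```python
import math
import random


def random_shape_parameters(n_centers, eps_min, eps_max, seed=0):
    """Draw one shape parameter per center, uniformly in [eps_min, eps_max]."""
    rng = random.Random(seed)
    return [eps_min + (eps_max - eps_min) * rng.random() for _ in range(n_centers)]


def multiquadric_matrix(points, centers, shapes):
    """Entry (i, j) is sqrt(1 + (eps_j * |x_i - c_j|)^2); column j uses eps_j."""
    return [
        [math.sqrt(1.0 + (eps * (x - c)) ** 2) for c, eps in zip(centers, shapes)]
        for x in points
    ]


# Hypothetical 1-D setup: five equally spaced centers on [0, 1].
centers = [0.0, 0.25, 0.5, 0.75, 1.0]
shapes = random_shape_parameters(len(centers), eps_min=1.0, eps_max=3.0)
B = multiquadric_matrix(centers, centers, shapes)
print(shapes)
```

With a constant shape parameter every column of the matrix comes from the same basis function; varying it per column is the change the abstract credits with better accuracy.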
Funding: National Natural Science Foundation of China (No. 61074079); Shanghai Leading Academic Discipline Project, China (No. B504).
Abstract: Complex industrial processes often have multiple operating modes, and the challenge of multimode process monitoring has recently attracted much attention. However, most multivariate statistical process monitoring (MSPM) methods assume that the process has only one nominal mode. When the process data follow several different distributions, these methods may not perform as well as they do in single-mode processes. To address this issue, an improved partial least squares (IPLS) method is proposed for multimode process monitoring. With a novel local standardization strategy, the normal data from multiple modes are brought to a common center after standardization, so that the fundamental assumption of partial least squares (PLS) holds again in multimode processes. In this way, the PLS method is extended to suit both single-mode and multimode processes. The effectiveness of the proposed method is illustrated by comparing the monitoring results of PLS and IPLS on the Tennessee Eastman (TE) process.
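Why standardizing locally restores a single common center can be seen in a small sketch. The abstract does not describe the authors' local standardization strategy, so the version below makes a simplifying assumption: the operating mode of each sample is known, and each sample is scaled by the mean and standard deviation of its own mode. The data and names are hypothetical.

```python
from statistics import mean, pstdev


def standardize_per_mode(values, modes):
    """Z-score each value using the mean and std of its own operating mode."""
    stats = {}
    for mode in set(modes):
        group = [v for v, m in zip(values, modes) if m == mode]
        stats[mode] = (mean(group), pstdev(group))
    return [(v - stats[m][0]) / stats[m][1] for v, m in zip(values, modes)]


# Hypothetical measurements of one variable in two operating modes.
values = [9.8, 10.1, 10.3, 9.9, 49.0, 51.5, 50.2, 49.6]
modes = ["A", "A", "A", "A", "B", "B", "B", "B"]

scaled = standardize_per_mode(values, modes)
scaled_a = [s for s, m in zip(scaled, modes) if m == "A"]
scaled_b = [s for s, m in zip(scaled, modes) if m == "B"]
print(scaled)
```

Standardizing with the global mean and standard deviation would leave two separate clusters, whereas here both modes end up centered at zero with unit spread, which is the single-distribution situation that PLS-based monitoring assumes.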