Funding: Supported by the National Natural Science Foundation of China (Nos. 51609175, 51790533, 51879196, and 51439006).
Abstract: Statistical models can efficiently establish the relationships between crop growth and environmental conditions while explicitly quantifying uncertainties. This study tested the efficiency of statistical models built with partial least squares regression (PLSR) and artificial neural networks (ANN) in predicting seed yields of sunflower (Helianthus annuus). Two-year field trial data on sunflower growth under different salinity levels and nitrogen (N) application rates at the Yichang Experimental Station in the Hetao Irrigation District, Inner Mongolia, China, were used to calibrate and validate the models. The variable importance in projection (VIP) score was calculated to select the sensitive crop indices for seed yield prediction. When only the most sensitive indices were used as inputs for seed yield estimation, the PLSR attained an accuracy (root mean square error (RMSE) = 0.93 t ha^-1, coefficient of determination (R^2) = 0.69) comparable to that obtained with all measured indices (RMSE = 0.81 t ha^-1, R^2 = 0.77). The ANN model outperformed the PLSR for yield prediction with different combinations of inputs for both microplot and field data. The results indicate that sunflower seed yield can be reasonably estimated from a small number of crop characteristic indices under complex environmental conditions and management options (e.g., saline soils and N application). Since leaf area index and plant height were found to be the most sensitive crop indices for sunflower seed yield prediction, remotely sensed data and the ANN model may be combined for regional crop yield simulation.
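As a rough illustration of how a variable importance in projection score can be obtained from a single-response PLS fit, the sketch below runs a plain NIPALS PLS1 loop and applies the usual VIP formula. It is not the study's code: the function name `vip_scores`, the autoscaling of predictors, and the small synthetic arrays `X_demo` and `y_demo` are all assumptions made for this example, and the numbers are not taken from the field trials.

```python
import math

def vip_scores(X, y, n_components):
    """VIP score per predictor from a single-response PLS (NIPALS) fit.

    X is a list of rows, y a list of responses. Constant columns are not handled.
    """
    n, p = len(X), len(X[0])
    # Autoscale each predictor column (mean 0, unit sample variance); center y.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
           for j in range(p)]
    E = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that the predictors can explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # y sum of squares explained by this component

    total = sum(explained)
    if total == 0.0:
        raise ValueError("no component explains any variation in y")
    return [math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
            for j in range(p)]

# Synthetic illustration only: column 0 tracks y exactly, column 1 loosely,
# column 2 is uncorrelated with y.
X_demo = [[1, 2, 1], [2, 1, 0], [3, 4, 0], [4, 3, 0], [5, 6, 0], [6, 5, 1]]
y_demo = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
print(vip_scores(X_demo, y_demo, n_components=2))
```

Because each weight vector has unit length, the squared VIP scores sum to the number of predictors, so a score above 1 is commonly read as above-average importance. That cutoff is a convention, not something the abstract specifies.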
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 10971107, 11271205), the "131" Talents Program of Tianjin, and the Fundamental Research Funds for the Central Universities (Grant Nos. 65030011, 65011481).
Abstract: A supersaturated design (SSD), whose run size is too small to estimate all the main effects, is commonly used in screening experiments. It offers a potentially useful tool for investigating a large number of factors with only a few experimental runs. Many authors have proposed analysis methods for identifying active effects when only one response is considered. However, two or more responses are often observed simultaneously in one screening experiment, so methods for analyzing SSDs with multiple responses are needed. In this paper, we propose a two-stage variable selection strategy, called the multivariate partial least squares-stepwise regression (MPLS-SR) method, which uses multivariate partial least squares regression in conjunction with a stepwise regression procedure to select the truly active effects in SSDs with multiple responses. Simulation studies show that the MPLS-SR method performs well and is easy to understand and implement.
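The abstract does not spell out the two stages, so the following is only one plausible reading, sketched under stated assumptions: stage 1 ranks factors by the first multivariate PLS weight vector (the dominant left singular vector of X^T Y, found here by power iteration) and keeps the top few; stage 2 runs forward selection with an F-to-enter rule on each response and pools the selected factors. Forward selection stands in for full stepwise regression, which would also allow terms to be dropped. The function names, the `keep` and `f_enter` settings, and the 8-run, 8-factor toy design are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]

def pls_screen(x_cols, y_cols, keep, iters=100):
    """Stage 1 (assumed form): keep the factors with the largest first PLS weights."""
    m = [[dot(x, y) for y in y_cols] for x in x_cols]  # X^T Y
    w = [1.0] * len(x_cols)
    for _ in range(iters):  # power iteration on (X^T Y)(X^T Y)^T
        c = [sum(m[j][k] * w[j] for j in range(len(m))) for k in range(len(y_cols))]
        w = [dot(row, c) for row in m]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed to the zero vector")
        w = [v / norm for v in w]
    ranked = sorted(range(len(w)), key=lambda j: -abs(w[j]))
    return ranked[:keep]

def forward_select(x_cols, y, candidates, f_enter=4.0):
    """Stage 2 (forward selection only): add factors while the partial F is large."""
    n = len(y)
    residual, basis, selected = list(y), [], []
    remaining = list(candidates)
    while remaining:
        best = None
        for j in remaining:
            v = list(x_cols[j])
            for b in basis:  # orthogonalize against factors already in the model
                coef = dot(v, b) / dot(b, b)
                v = [vi - coef * bi for vi, bi in zip(v, b)]
            vv = dot(v, v)
            if vv < 1e-12:
                continue  # candidate is collinear with the current model
            gain = dot(residual, v) ** 2 / vv  # drop in residual sum of squares
            if best is None or gain > best[0]:
                best = (gain, j, v)
        df = n - len(selected) - 2  # residual df after adding one term (with intercept)
        if best is None or df <= 0:
            break
        gain, j, v = best
        rss_new = dot(residual, residual) - gain
        if rss_new > 1e-12 and gain / (rss_new / df) < f_enter:
            break
        coef = dot(residual, v) / dot(v, v)
        residual = [r - coef * vi for r, vi in zip(residual, v)]
        basis.append(v)
        selected.append(j)
        remaining.remove(j)
    return selected

# Toy two-level design: 8 runs, 8 factors (more factors than n - 1 = 7).
design = [
    [-1, 1, -1, 1, -1, 1, -1, 1], [-1, -1, 1, 1, -1, -1, 1, 1],
    [-1, -1, -1, -1, 1, 1, 1, 1], [1, -1, -1, 1, 1, -1, -1, 1],
    [1, -1, 1, -1, -1, 1, -1, 1], [1, 1, -1, -1, -1, -1, 1, 1],
    [-1, 1, 1, -1, 1, -1, -1, 1], [1, 1, 1, -1, -1, -1, -1, 1],
]
noise = [0.8, -0.4, -0.4, 0.0, -0.4, 0.0, 0.0, 0.4]  # fixed, made-up disturbance
f1, f2, f3 = design[0], design[1], design[2]
y1 = [10 + 5 * a + 2 * b + e for a, b, e in zip(f1, f2, noise)]  # factors 0, 1 active
y2 = [3 + 4 * a - 3 * c - e for a, c, e in zip(f1, f3, noise)]   # factors 0, 2 active

x_cols = [center(col) for col in design]
y_cols = [center(y1), center(y2)]
screened = pls_screen(x_cols, y_cols, keep=4)
active = sorted({j for y in y_cols for j in forward_select(x_cols, y, screened)})
print(screened, active)
```

Pooling the per-response selections as a union is a design choice of this sketch. A factor that matters for only one response is still reported as active, which suits a screening setting where missing a true effect is usually costlier than carrying an extra one.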