[Objectives] This study was conducted to address three problems in near-infrared (NIR) modeling of soybean lysine: complex spectral information, serious collinearity, and the insufficient predictive ability of full-spectrum models. [Methods] A new variable selection method, the variable combination model population analysis method, was used to select characteristic wavelengths from the NIR spectra of soybean lysine. The binary matrix sampling strategy and an exponential decay function were first used to delete uninformative variables and select the NIR characteristic wavelengths of soybean lysine, which were then combined with the partial least squares method to establish a prediction model. Compared with other variable selection methods, the Monte Carlo variable combination model population analysis method selected the fewest wavelength points, and its model had the strongest predictive ability. By adopting the binary matrix sampling strategy, the variable combination model population analysis method made up for the shortcomings of using Monte Carlo sampling alone. [Results] The experimental results showed that the Monte Carlo variable combination model population analysis algorithm could better select the characteristic wavelengths of soybean lysine NIR spectra and improve the reliability of the prediction model. However, the overall accuracy of the lysine prediction model was not satisfactory, and the model needs to be rebuilt and optimized in future research. The reason might be that the chemical reference values of lysine content were not determined accurately enough, or that the hydrogen-containing groups of lysine absorb poorly in the NIR region and correlate poorly with proteins.
[Conclusions] This study provides a reference for soybean high-lysine breeding.
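The two ingredients named in the Methods, binary matrix sampling and an exponential decay function, can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names, parameter values, and the synthetic data are assumptions made for illustration, and ordinary least squares scored on a hold-out split stands in for cross-validated partial least squares to keep the example dependency-free.

```python
import numpy as np


def binary_matrix_sampling(n_models, n_vars, ratio, rng):
    """Return an (n_models, n_vars) 0/1 matrix whose columns all hold the same number of ones.

    Each row is one variable subset; every variable is sampled equally often.
    """
    ones_per_col = int(round(n_models * ratio))
    matrix = np.zeros((n_models, n_vars), dtype=int)
    matrix[:ones_per_col, :] = 1
    # Permute each column independently so the rows become random subsets.
    for j in range(n_vars):
        matrix[:, j] = rng.permutation(matrix[:, j])
    return matrix


def edf_schedule(n_vars, n_final, n_runs):
    """Number of variables to retain after each run, decaying exponentially to n_final."""
    theta = np.log(n_vars / n_final) / n_runs
    return [int(round(n_vars * np.exp(-theta * i))) for i in range(1, n_runs + 1)]


def shrink_variables(X, y, n_final=5, n_runs=10, n_models=200,
                     ratio=0.5, best_frac=0.1, seed=0):
    """Illustrative shrinkage loop (hypothetical helper, not the published algorithm)."""
    rng = np.random.default_rng(seed)
    keep = np.arange(X.shape[1])
    half = X.shape[0] // 2  # first half fits the model, second half scores it
    for target in edf_schedule(X.shape[1], n_final, n_runs):
        B = binary_matrix_sampling(n_models, keep.size, ratio, rng)
        scores = np.full(n_models, np.inf)
        for k, row in enumerate(B):
            cols = keep[row == 1]
            if cols.size == 0:
                continue  # an empty subset cannot be modeled
            # Stand-in for PLS: least squares fit, scored by hold-out RMSE.
            coef, *_ = np.linalg.lstsq(X[:half][:, cols], y[:half], rcond=None)
            resid = y[half:] - X[half:][:, cols] @ coef
            scores[k] = np.sqrt(np.mean(resid ** 2))
        # Count how often each variable appears in the best sub-models.
        best = np.argsort(scores)[: max(1, int(n_models * best_frac))]
        freq = B[best].sum(axis=0)
        order = np.argsort(-freq, kind="stable")
        keep = np.sort(keep[order[:target]])
    return keep


# Synthetic demonstration: only columns 3 and 7 carry signal.
rng_demo = np.random.default_rng(1)
X_demo = rng_demo.normal(size=(60, 20))
y_demo = 2.0 * X_demo[:, 3] - 3.0 * X_demo[:, 7] + 0.01 * rng_demo.normal(size=60)
selected = shrink_variables(X_demo, y_demo, n_final=4, n_runs=5)
print(selected)
```

In this sketch the retained-variable count shrinks quickly in early runs and slowly in later ones, so clearly uninformative wavelengths are dropped first while the remaining candidates are compared more carefully.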
Funding: Supported by the Agricultural Development Fund Plan of Chongqing Academy of Agricultural Sciences (NKY-2020AC008); Project of Chongqing Science and Technology Bureau (Ycstc,2019cc0101,CQYC201903216,Ycstc,2020ac1102,cstc2019jscx-gksbX0138); National Agricultural Science Germplasm Resources Jiangjin Observation and Experimental Station; Chongqing Grain and Oil Crop Field Scientific Observation and Research Station.