Funding: the National Natural Science Foundation of China (62105245), the Zhejiang Natural Science Foundation of China (LQ20F030059 and LY21C200001), and the Wenzhou Science and Technology Bureau General Project (S2020011), China.
Abstract: Detection of fruit traits using near-infrared (NIR) spectroscopy may encounter out-of-distribution samples that exceed the generalization ability of the constructed calibration model. Confidence analysis for a given prediction is therefore required, but common NIR calibration models cannot provide it. To address this issue, this paper studied Gaussian process regression (GPR) for fruit trait detection using NIR spectroscopy. The mean and variance of the GPR were used as the predicted value and its confidence, respectively. To demonstrate this, a real NIR data set of dry matter content measurements in mango was used. Compared with partial least squares regression (PLSR), GPR showed approximately 14% lower root mean squared error (RMSE) on the in-distribution test set. Compared with no confidence analysis, using the GPR variance to remove abnormal samples lowered the RMSE of GPR and PLSR on the mixed-distribution test set by approximately 58% and 10%, respectively (with the type I error rate set to 0.1). Compared with traditional one-class classification methods, the GPR variance can be used to effectively eliminate poorly predicted samples.
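The idea of using the GPR mean as the prediction and the GPR variance as a rejection criterion can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic three-feature vectors rather than mango NIR spectra, the kernel is a fixed squared-exponential kernel with unit length scale and unit signal variance, and the rejection threshold is chosen as the 90th percentile of the predictive variance on a held-out in-distribution calibration set, which is one plausible way to obtain a type I error rate of 0.1.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gpr_fit(X, y, noise_var=1e-2):
    """Factorize the noisy training covariance once for later predictions."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return X, L, alpha

def gpr_predict(model, X_new):
    """Return the predictive mean and the variance of the latent function."""
    X, L, alpha = model
    K_s = rbf_kernel(X_new, X)
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance equals signal_var = 1.0
    return mean, np.maximum(var, 0.0)

def target(X):
    """Hypothetical trait value; stands in for a measured reference value."""
    return np.sin(X.sum(axis=1))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rng = np.random.default_rng(0)

# Synthetic training set (assumption: 3 features instead of full spectra).
X_train = rng.uniform(-1, 1, (80, 3))
y_train = target(X_train) + 0.05 * rng.standard_normal(80)
model = gpr_fit(X_train, y_train)

# Threshold from in-distribution calibration samples: reject the 10% with
# the largest predictive variance (assumed reading of "type I error 0.1").
X_cal = rng.uniform(-1, 1, (50, 3))
_, var_cal = gpr_predict(model, X_cal)
threshold = np.quantile(var_cal, 0.9)
cal_reject_rate = float(np.mean(var_cal > threshold))

# Mixed test set: 50 in-distribution samples followed by 20 shifted samples.
X_mixed = np.vstack([rng.uniform(-1, 1, (50, 3)), rng.uniform(3, 4, (20, 3))])
y_mixed = target(X_mixed)
mean_mixed, var_mixed = gpr_predict(model, X_mixed)
keep = var_mixed <= threshold  # samples whose prediction is trusted

rmse_all = rmse(y_mixed, mean_mixed)
rmse_kept = rmse(y_mixed[keep], mean_mixed[keep])
print(f"RMSE without rejection: {rmse_all:.3f}")
print(f"RMSE after variance-based rejection: {rmse_kept:.3f}")
```

Far from the training data the kernel values approach zero, so the predictive variance returns to the prior variance and the shifted samples exceed the threshold. In practice the kernel hyperparameters would be estimated from the data, and a library implementation such as scikit-learn's `GaussianProcessRegressor` would typically replace the hand-written functions.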