Abstract
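This paper describes the harm caused by outlier samples in the training set when building PCR/PLS models from near-infrared spectra, and the commonly used criterion for removing them, which is based on predicted concentration residuals. Because the "once-detect" method of removing outliers can mistakenly treat non-outlier samples as outliers, a "twice-detect" method is proposed. It uses a "callback" operator so that the final model retains more samples, which makes the model more representative and stable and further improves the accuracy of agricultural product quality inspection based on near-infrared spectral models.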
This paper presents a new algorithm for detecting and eliminating outlier samples when building a PCR or PLS calibration model for near-infrared (NIR) spectroscopy. Outlier samples in the training set make the predicted concentrations of valid samples less accurate or even wrong, so it is important to detect them and remove them from the training set. A common detection criterion is based on predicted concentration residuals: samples whose concentration residuals are significantly larger than those of the rest of the training set are treated as concentration outliers. With the once-detect (OD) method, however, some non-outlier samples are often falsely regarded as outliers. To address this, a new method named twice-detect (TD), which uses a callback arithmetic operator (CAO), was developed. Compared with the OD method, the TD method keeps more valid samples in the final model, which makes the final model more representative, stable, and robust.
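The residual criterion and the two-pass idea can be illustrated with a minimal Python sketch. The abstract does not define the callback operator, so the recall rule used here is an assumption: a flagged sample is returned to the training set if its residual under a model refitted without the flagged samples falls within the threshold. The names `fit_pcr`, `predict_pcr`, `once_detect`, and `twice_detect`, the threshold factor `k`, and the choice of PCR via SVD are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression via SVD of the mean-centered spectra."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                      # loadings, shape (features, components)
    scores = Xc @ V
    coef, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return x_mean, y_mean, V @ coef              # regression vector in spectral space

def predict_pcr(model, X):
    x_mean, y_mean, beta = model
    return (X - x_mean) @ beta + y_mean

def once_detect(X, y, n_components=2, k=2.5):
    """Single pass: flag samples whose concentration residual exceeds k * std."""
    resid = y - predict_pcr(fit_pcr(X, y, n_components), X)
    return np.abs(resid) > k * resid.std(ddof=1)

def twice_detect(X, y, n_components=2, k=2.5):
    """Second pass (assumed rule): refit without the flagged samples and recall
    any flagged sample whose residual under the refitted model is within k * std
    of the retained samples' residuals."""
    flagged = once_detect(X, y, n_components, k)
    keep = ~flagged
    model = fit_pcr(X[keep], y[keep], n_components)
    resid = y - predict_pcr(model, X)
    limit = k * resid[keep].std(ddof=1)
    recalled = flagged & (np.abs(resid) <= limit)
    return flagged & ~recalled, recalled         # (final outliers, recalled samples)
```

Both functions return boolean masks over the training samples, so a final model under this sketch would be fitted on `X[~outliers]` and `y[~outliers]`. A leave-one-out or cross-validated residual could be substituted for the fitted residual without changing the structure.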
Source
《农业机械学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2004, No. 4, pp. 115-119 (5 pages)
Transactions of the Chinese Society for Agricultural Machinery
Funding
High-Tech Industrialization Demonstration Project of the State Planning Commission (Project No.: 计高技[2001]561号)
National "Ninth Five-Year Plan" Key Science and Technology Project (Project No.: 990100112)
Funded by the Doctoral Start-up Fund of Southwest Agricultural University (2003)