Abstract
Building near-infrared (NIR) models of agricultural products requires a large number of samples. Selecting representative calibration samples directly affects the representativeness and accuracy of the model and reduces the modeling workload. This paper reviews three common methods for selecting calibration-set samples: the concentration-gradient method, the Duplex method, and the Kennard-Stone method. It then proposes a new selection method, the GN distance method. The method uses a global distance to bound the range of the calibration set and a neighborhood distance to remove similar samples. Calibration sets are selected under different combinations of global and neighborhood distance, a model is built from each, and the combination that yields the minimum standard error of cross-validation (SECV) determines the most suitable calibration set. The four methods are compared on an example, and their strengths and weaknesses are discussed. The results show that the GN distance method preserves the coverage of the original sample set while removing a moderate number of outliers, and that models built on the calibration set it selects have lower complexity, a higher correlation coefficient (R), and better predictive ability (lower SECV and SEP).
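The abstract describes the GN distance method only in outline, so the following is a minimal sketch of the two-step idea under stated assumptions, not the authors' implementation. It assumes each sample is a row of roughly uncorrelated scores (for example PCA scores of the spectra), uses Euclidean distance on standardized scores as a stand-in for the distance measure, and visits candidates from the centroid outward. The names `gn_select`, `best_combination`, `global_limit`, `neighbor_limit`, and the caller-supplied `secv` function are all hypothetical.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(p)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]


def gn_select(rows, global_limit, neighbor_limit):
    """Return sorted indices of the samples kept for calibration.

    Assumption: distances are Euclidean on standardized scores, which is
    a simplification chosen for this sketch.
    """
    z = standardize(rows)
    # Global step: distance of each sample to the centroid, which is the
    # origin after standardizing. Samples beyond the limit are dropped.
    global_d = [math.sqrt(sum(v * v for v in row)) for row in z]
    candidates = [i for i, d in enumerate(global_d) if d <= global_limit]
    # Neighborhood step: walk from the centre outward and keep a sample
    # only if it is farther than neighbor_limit from every kept sample.
    candidates.sort(key=lambda i: global_d[i])
    kept = []
    for i in candidates:
        if all(math.dist(z[i], z[k]) > neighbor_limit for k in kept):
            kept.append(i)
    return sorted(kept)


def best_combination(rows, global_limits, neighbor_limits, secv):
    """Try every limit pair and return (score, g, n, indices) with lowest SECV.

    `secv` is a hypothetical callback: it receives the selected indices,
    builds a model on them, and returns the cross-validation error.
    """
    best = None
    for g in global_limits:
        for n in neighbor_limits:
            idx = gn_select(rows, g, n)
            score = secv(idx)
            if best is None or score < best[0]:
                best = (score, g, n, idx)
    return best
```

In practice `secv` would fit a calibration model (for instance partial least squares) on the selected samples and return its cross-validation error; that step is left to the caller here because the abstract does not specify the regression method.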
Source
《农业机械学报》
EI
CAS
CSCD
PKU Core Journal
2006, No. 4, pp. 80-82, 101 (4 pages)
Transactions of the Chinese Society for Agricultural Machinery
Funding
Supported by the National "863" High-Tech Research and Development Program of China (Grant No. 2003AA209012)
Keywords
Calibration set
Near-infrared (NIR)
Sample selection
Global distance (Global H)
Neighborhood distance (Neighborhood H)