Abstract
To improve the efficiency and performance of the analysis model, a wavelength selection method that combines variable stability with population analysis (VSPA) is proposed. The algorithm works in two spaces, the sample space and the variable space. In the sample space, the stability of each variable is calculated and, according to the stability values, weighted bootstrap sampling divides the variables into useful and useless variables. In the variable space, the frequency with which each variable occurs is counted, and an exponentially decreasing function is used to remove the lower-frequency variables from among the useless variables. The algorithm is applied to a near-infrared (NIR) spectral data set of corn to predict starch content. The root-mean-square error of prediction (RMSEP) and the correlation coefficient of prediction (R_p) are 0.0409 and 0.9974, respectively, and the selected variables amount to only 2.7% of the original spectral variables. These results indicate that the proposed method improves both the computational efficiency and the prediction accuracy of the model and is an effective variable selection method.
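The abstract names three building blocks (variable stability, weighted bootstrap sampling, and an exponentially decreasing function) but gives no formulas. The following is a minimal sketch of those blocks under stated assumptions, not the authors' implementation. Stability is assumed to be the absolute mean of a variable's regression coefficient divided by its standard deviation across sub-models, a common definition. The decay schedule, the function names `stability`, `weighted_bootstrap`, and `edf_keep_count`, and all parameters are hypothetical. The coefficients in the demo are simulated rather than fitted by partial least squares.

```python
import math
import random
from statistics import mean, stdev


def stability(coef_rows):
    """Per-variable stability, assumed here to be |mean| / std of the
    regression coefficients across sub-models (one row per sub-model,
    each sub-model fitted on a random subset of samples)."""
    scores = []
    for col in zip(*coef_rows):
        spread = stdev(col)
        # Degenerate case (no spread): score set to 0, an arbitrary
        # choice made only for this sketch.
        scores.append(abs(mean(col)) / spread if spread > 0 else 0.0)
    return scores


def weighted_bootstrap(weights, n_draws, rng):
    """Draw variable indices with replacement, with probability
    proportional to weight, and return the distinct indices drawn."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))


def edf_keep_count(iteration, n_iterations, n_variables, n_final):
    """Number of variables retained at a given iteration under an
    assumed exponential decay from n_variables down to n_final."""
    rate = math.log(n_variables / n_final) / n_iterations
    return max(n_final, round(n_variables * math.exp(-rate * iteration)))


if __name__ == "__main__":
    rng = random.Random(0)
    n_vars = 20
    # Simulated coefficients from 30 sub-models; variables 0-4 carry signal.
    coef_rows = [
        [rng.gauss(1.0 if j < 5 else 0.0, 0.3) for j in range(n_vars)]
        for _ in range(30)
    ]
    scores = stability(coef_rows)
    # Count how often each variable is drawn over repeated weighted bootstraps.
    frequency = [0] * n_vars
    for _ in range(100):
        for j in weighted_bootstrap(scores, n_vars, rng):
            frequency[j] += 1
    keep = edf_keep_count(iteration=10, n_iterations=20,
                          n_variables=n_vars, n_final=5)
    ranked = sorted(range(n_vars), key=lambda j: frequency[j], reverse=True)
    print("retained variables:", sorted(ranked[:keep]))
```

The demo simply keeps the most frequently drawn variables at one iteration. The paper's rule, which removes low-frequency variables only from among the useless ones, would need the useful/useless split that the abstract does not specify in detail.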
Authors
张峰
汤晓君
仝昂鑫
王斌
王经纬
ZHANG Feng; TANG Xiao-Jun; TONG Ang-Xin; WANG Bin; WANG Jing-Wei (State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, Xi’an 710049, China)
Source
《红外与毫米波学报》
SCIE
EI
CAS
CSCD
PKU Core Journals
2020, No. 3, pp. 318-323 (6 pages)
Journal of Infrared and Millimeter Waves
Funding
National Key Research and Development Program of China (2016YFF0102805).
Keywords
wavelength selection
weighted bootstrap sampling
near-infrared spectroscopy
partial least squares