Abstract
Moving window smoothing ensemble CARS (MWS-ECARS) is a stable algorithm for extracting characteristic variables. Building on previous work, this paper proposes two improved versions of MWS-ECARS, each based on a different window smoothing algorithm, to reduce the dimensionality of black tea spectra. They are compared with the original MWS-ECARS and with three commonly used methods: the successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), and moving window partial least squares (MWPLS). Partial least squares regression (PLSR) models are built on the selected variables in order to choose the best black tea grade discrimination model. The two improved methods are Gaussian filter ECARS (GF-ECARS) and median filter ECARS (MF-ECARS). CARS is run n times (n = 1000 in this study), and the wavelengths and their corresponding selection frequencies are collected. The selection frequencies are then smoothed with the different window smoothing algorithms, with window widths from 3 to 31 in steps of 2. Thresholds are applied to the selection frequencies smoothed under each window width and smoothing algorithm, with a starting threshold of 20 and a threshold step of 20. Finally, the wavelengths whose selection frequency exceeds the threshold are selected and a PLSR model is built. The correlation coefficient of the prediction set (RP^2) is used as the evaluation criterion: the closer RP^2 is to 1, the more accurate the model's predictions.
The results show that the black tea grade discrimination model built on the characteristic variables extracted by the improved GF-ECARS algorithm performs best, with RP^2 reaching 0.9692. The reason is that, in window Gaussian filter smoothing, the differences in amplitude between points on the curve shrink as the window width increases, and during Gaussian weighted averaging, wavelengths with low selection frequency are unlikely to be given high weights. In practice, effective bands often have low selection frequencies; these can be smoothed with a Gaussian filter of narrow window width. In addition, the shape of the Gaussian curve allows Gaussian filtering to preserve detail at the window edges well. Although the modeling result of MF-ECARS is slightly worse than that of the original MWS-ECARS, its RP^2 still exceeds 0.96, indicating that the improved algorithms can improve the predictive ability of the original model. The characteristic variables extracted by MWS-ECARS differ with the window smoothing algorithm, but in every case, as the smoothing window widens, the selected variable intervals become more continuous and the number of selected variables decreases. The RP^2 values of the three MWS-ECARS algorithms all show that they are more effective and more stable than the commonly used SPA, CARS and MWPLS algorithms. This study provides a reference for research on selective dimensionality reduction of spectral data.
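The smoothing and thresholding steps described in the abstract can be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation: the function names, the choice of sigma as one sixth of the window width, the edge padding, the upper bound of the threshold grid, and the synthetic selection frequencies are all assumptions. The CARS runs and the PLSR fitting are omitted.

```python
import numpy as np


def gaussian_kernel(width, sigma=None):
    """Return normalised Gaussian weights for an odd window width."""
    if sigma is None:
        sigma = width / 6.0  # assumption: the window spans roughly +/- 3 sigma
    offsets = np.arange(width) - width // 2
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()


def smooth_gaussian(freq, width):
    """Gaussian-weighted moving average of the selection frequencies."""
    # Edge padding (an assumption) keeps the output the same length as the input.
    padded = np.pad(np.asarray(freq, dtype=float), width // 2, mode="edge")
    return np.convolve(padded, gaussian_kernel(width), mode="valid")


def smooth_median(freq, width):
    """Moving-window median of the selection frequencies."""
    padded = np.pad(np.asarray(freq, dtype=float), width // 2, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    return np.median(windows, axis=1)


def select_wavelengths(freq, width, threshold, smoother):
    """Indices of wavelengths whose smoothed frequency exceeds the threshold."""
    return np.flatnonzero(smoother(freq, width) > threshold)


# Synthetic stand-in for the selection counts gathered over n CARS runs.
rng = np.random.default_rng(0)
n_runs = 1000
freq = rng.integers(0, 50, size=200).astype(float)
freq[80:100] += 600.0  # a hypothetical band that is selected often

widths = range(3, 32, 2)            # window widths 3 to 31, step 2
thresholds = range(20, n_runs, 20)  # start 20, step 20; upper bound is assumed

# One candidate wavelength subset per (smoother, width, threshold) combination.
candidates = {
    (name, w, t): select_wavelengths(freq, w, t, fn)
    for name, fn in (("GF", smooth_gaussian), ("MF", smooth_median))
    for w in widths
    for t in thresholds
}
```

Each entry of `candidates`, for example `candidates[("GF", 5, 100)]`, is an array of wavelength indices. In the workflow the abstract describes, each such subset would then be used to fit a PLSR model, and the subsets would be ranked by RP^2.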
Authors
袁荔
施斌
于建成
唐天宇
袁园
唐延林
YUAN Li; SHI Bin; YU Jian-cheng; TANG Tian-yu; YUAN Yuan; TANG Yan-lin (School of Physics, Guizhou University, Guiyang 550025, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2020, Issue 10, pp. 3254-3259 (6 pages)
Spectroscopy and Spectral Analysis
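Funding
National Natural Science Foundation of China (11164004, 11864006)
Guizhou Province Photonic Science and Technology Innovation Talent Team Project (20154017)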
Keywords
Moving window smoothing ensemble CARS (MWS-ECARS)
Visible-near infrared spectroscopy
Black tea
Grades