Abstract
Partial least squares (PLS) handles collinearity among variables in analytical data well and also copes with the high dimensionality of spectral data. It is therefore widely used in quantitative spectral analysis, especially for near-infrared, mid-infrared, and Raman spectra. To improve the extraction of useful information and the suppression of noise in PLS, a variable-clustering re-weighted PLS algorithm is proposed. The wavenumber variables of the spectrum are clustered into several subsets, a sub-model is built for each subset, and the sub-models are then combined into a full-spectrum model. A weight is computed for and assigned to each subset, so that the variables are re-weighted according to their contribution to the model, which improves prediction accuracy. Validation on two near-infrared datasets (octane number prediction in gasoline and nicotine content prediction in tobacco) shows that the proposed algorithm outperforms the classical PLS algorithm: the root mean square error of prediction (RMSEP) is reduced by 32% and 22% on the two datasets, respectively, indicating potential advantages for the quantitative analysis of spectral data.
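The cluster, model, weight, and combine steps described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the clustering method or the weighting rule, so the k-means clustering of centred variable profiles, the hold-out validation split, and the inverse-squared-error weights are all assumptions, and every function name here is hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def take(X, idx):
    """Keep only the columns (spectral variables) listed in idx."""
    return [[row[j] for j in idx] for row in X]

def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def pls1_fit(X, y, n_comp):
    """Single-response PLS (NIPALS-style deflation). Returns means and components."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[x - mu for x, mu in zip(row, x_mean)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_comp):
        w = [dot(col, f) for col in zip(*E)]                   # weight vector ~ X^T y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                                       # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        if tt < 1e-12:
            break
        p = [dot(col, t) / tt for col in zip(*E)]              # X loadings
        q = dot(f, t) / tt                                     # y loading
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, X):
    x_mean, y_mean, comps = model
    out = []
    for row in X:
        e = [x - mu for x, mu in zip(row, x_mean)]
        y_hat = y_mean
        for w, p, q in comps:                                  # replay the deflation
            t = dot(e, w)
            y_hat += q * t
            e = [ei - t * pj for ei, pj in zip(e, p)]
        out.append(y_hat)
    return out

def cluster_variables(X, k, n_iter=20):
    """ASSUMPTION: k-means over centred variable profiles (columns of X)."""
    cols = []
    for c in zip(*X):
        mu = sum(c) / len(c)
        cols.append([v - mu for v in c])
    centres = [cols[i * len(cols) // k] for i in range(k)]     # deterministic init
    labels = [0] * len(cols)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2
                  for a, b in zip(col, centres[c]))) for col in cols]
        for c in range(k):
            members = [col for col, lab in zip(cols, labels) if lab == c]
            if members:
                centres[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def fit_cluster_weighted_pls(X_tr, y_tr, X_val, y_val, k=3, n_comp=2):
    labels = cluster_variables(X_tr, k)
    groups = [[j for j, lab in enumerate(labels) if lab == c] for c in range(k)]
    groups = [g for g in groups if g]                          # drop empty clusters
    models, scores = [], []
    for g in groups:
        model = pls1_fit(take(X_tr, g), y_tr, n_comp)
        err = rmse(y_val, pls1_predict(model, take(X_val, g)))
        models.append(model)
        scores.append(1.0 / (err ** 2 + 1e-12))                # ASSUMPTION: inverse-MSE weight
    total = sum(scores)
    return groups, models, [s / total for s in scores]

def predict_cluster_weighted_pls(ensemble, X):
    groups, models, weights = ensemble
    preds = [pls1_predict(m, take(X, g)) for g, m in zip(groups, models)]
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(X))]
```

Calling `fit_cluster_weighted_pls` on a training and a validation split returns the variable groups, the per-group sub-models, and weights that sum to one; `predict_cluster_weighted_pls` then forms the weighted combination of the sub-model predictions. Sub-models with lower validation error receive larger weights, which is one simple way to down-weight noisy spectral regions.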
Source
《分析化学》
SCIE
EI
CAS
CSCD
PKU Core Journals
2015, Issue 7, pp. 1086-1091 (6 pages)
Chinese Journal of Analytical Chemistry
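Funding
Supported by the National Natural Science Foundation of China (No. 31473255)
and the Zhejiang China Tobacco Science and Technology Project Fund (No. ZJZY2015C001)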
Keywords
Chemometrics
Partial least squares
Quantitative analysis
Spectrometric analysis
Model ensemble