Abstract
Hemoglobin is an important physiological index of the human body, and abnormal hemoglobin concentrations can lead to various diseases. Infrared spectroscopy is simple, non-destructive and fast, which makes it well suited to the quantitative analysis of physiological parameters. However, the spectral background is complex and the useful information is weak, so extracting effective feature variables and building an accurate quantitative model is difficult. To address this problem, spectral data from blood samples and hemoglobin phantom (imitation) solution samples were studied. Four data set partitioning methods (SPXY, K_S, duplex and equal-interval partitioning) were compared through modeling, and SPXY was selected as the best. Three preprocessing schemes were evaluated: Savitzky-Golay first-derivative filtering (S_G1) + wavelet transform, wavelet transform + S_G1, and standard normal variate transformation (SNV) + S_G1; SNV + S_G1 was selected. Following a cascading strategy, a characteristic wavelength selection method that connects synergy interval partial least squares (SiPLS) and the successive projections algorithm (SPA) in series is proposed, and a SiPLS-SPA-PLS prediction model is built. The model was validated on the two data sets, assessed with the evaluation indexes, and compared with three quantitative models: full-spectrum PLS, SPA-PLS and SiPLS. The experimental results show that: (1) With the SiPLS-SPA-PLS model, the R_c, R_p, RMSEC and RMSEP values are 0.9936, 0.9906, 0.1992 and 0.1846 for the blood samples, and 0.9989, 0.9985, 1.8489 and 2.0074 for the phantom solution samples. Compared with full-spectrum PLS, SPA-PLS and SiPLS, this model has the largest R_c and R_p values and the smallest RMSEC and RMSEP values, so it is the best of the four and quantifies hemoglobin more accurately. (2) The SiPLS-SPA-PLS model screens the optimal bands more accurately. The effective bands selected for the two sample types are 1144~1264 and 1606~1798 nm for blood, and 1018~1390 and 1600~1700 nm for the phantom solution; once instrument effects are excluded, the two sets are roughly the same, which indicates that the method can select the characteristic wavelengths accurately. (3) The model extracts effective variables and removes the influence of useless noise: 28 variables were selected from the 700 full-spectrum variables for the blood samples, and 41 from the 1201 full-spectrum variables for the phantom solution samples, which improves detection speed and prediction efficiency. This method offers an approach to the rapid and accurate detection of hemoglobin.
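The SNV + S_G1 preprocessing and the evaluation indexes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the 5-point window and quadratic fit for the Savitzky-Golay derivative are assumed (the abstract does not state them), and R is assumed here to be the Pearson correlation coefficient between reference and predicted values.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


# Savitzky-Golay first-derivative weights for an ASSUMED 5-point window and
# quadratic fit; the weighted sum is divided by 10.
SG1_WEIGHTS = (-2, -1, 0, 1, 2)


def sg_first_derivative(spectrum, step=1.0):
    """First derivative along the wavelength axis; the two points at each edge are dropped."""
    half = len(SG1_WEIGHTS) // 2
    out = []
    for i in range(half, len(spectrum) - half):
        acc = sum(w * spectrum[i + k]
                  for w, k in zip(SG1_WEIGHTS, range(-half, half + 1)))
        out.append(acc / (10.0 * step))
    return out


def rmse(y_true, y_pred):
    """Root mean square error (RMSEC on calibration data, RMSEP on prediction data)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Correlation coefficient between reference and predicted values (assumed Pearson)."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def preprocess(spectrum, step=1.0):
    """SNV followed by the first derivative, the order selected in the abstract."""
    return sg_first_derivative(snv(spectrum), step)
```

In practice a library routine such as `scipy.signal.savgol_filter` with `deriv=1` would normally replace the hand-written derivative; the pure-Python version is used here only to keep the sketch self-contained. Evaluating `rmse` and `pearson_r` once on the calibration set and once on the prediction set gives the four indexes reported above.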
Authors
高西娅
张朱珊莹
卢翠翠
蒙泳吉
曹汇敏
郑冬云
张莉
谢勤岚
GAO Xi-ya; ZHANG Zhu-shan-ying; LU Cui-cui; MENG Yong-ji; CAO Hui-min; ZHENG Dong-yun; ZHANG Li; XIE Qin-lan (College of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, China; Key Laboratory of Cognitive Science, State Ethnic Affairs Commission, Wuhan 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, Wuhan 430074, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2023, No. 1, pp. 50-56 (7 pages)
Spectroscopy and Spectral Analysis
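Funding
National Natural Science Foundation of China (61501526, 61178087)
Fundamental Research Funds for the Central Universities, South-Central Minzu University (CZQ22006).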