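Abstract
Laser-induced breakdown spectroscopy (LIBS) combined with machine learning algorithms was used to identify the origin of ginseng from five producing areas in northeast China (Daxinganling, Ji'an, Hengren, Shizhu, and Fusong). Origin identification models were built by combining principal component analysis (PCA) with a back-propagation (BP) neural network and with a support vector machine (SVM). A total of 657 LIBS spectra of ginseng from the five origins were collected over 200-975 nm. After spectral preprocessing, PCA was applied to eight characteristic spectral lines of the elements C, Mg, Ca, Fe, H, N, and O. The first three principal components have a cumulative contribution rate of 92.50%, and the samples cluster well in the principal component space. After dimension reduction, the first three principal components were randomly split 2:1 into a training set and a test set for the classification algorithms. The results show that the average recognition rates of PCA combined with the BP neural network and with the SVM are 99.08% and 99.5%, respectively. The misjudgments arise because the similar geographical environments of Ji'an and Shizhu lead to similar intensities of H and O normalized to the Ca ion emission line. This study provides a method and a reference for the rapid identification of ginseng origin by LIBS.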
Based on laser-induced breakdown spectroscopy (LIBS) and machine learning algorithms, ginseng origin identification models are established by combining principal component analysis (PCA) with a back-propagation (BP) neural network and with a support vector machine (SVM), in order to analyze and identify ginseng from five origins in northeast China (Daxinganling, Ji'an, Hengren, Shizhu, and Fusong). A total of 657 LIBS spectra of ginseng from the five origins are collected over 200–975 nm. The continuous background of the original spectra is reduced by a moving-window smoothing method, and the elements in the ginseng LIBS spectra are labeled according to the NIST Atomic Spectra Database. Eight characteristic spectral lines of seven elements (Mg, Ca, Fe, C, H, N, and O) are selected for PCA according to the characteristic-line selection conditions. The cumulative contribution rate of the first three principal components reaches 92.50%, so they retain most of the information in the original ginseng LIBS spectra, and the samples cluster well in the principal component space. After dimension reduction, the first three principal components are randomly split in a ratio of 2 to 1 into a training set of 438 spectra and a test set of 219 spectra, which serve as the inputs of the classification algorithms. The experimental results show that PCA combined with the BP neural network and with the SVM correctly identifies 217 and 218 of the 219 test spectra, respectively, and that the average recognition rates are 99.08% and 99.5%, respectively. The modeling time of the BP neural network is 11.545 s shorter than that of the SVM. Both models misjudge Ji'an ginseng as Shizhu ginseng. The reason is that the intensities of H and O normalized to the Ca ion emission line are similar, owing to the similar geographical environments of Ji'an and Shizhu. The study demonstrates that LIBS combined with machine learning algorithms is a useful technique for the rapid identification of ginseng origin and is expected to enable automatic, real-time, rapid, and reliable discrimination.
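The split and the recognition rates quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the function names `split_two_to_one` and `recognition_rate` and the random seed are assumptions, and the rate is computed as a plain fraction of correctly identified test spectra. A 2:1 random split of 657 spectra gives 438 and 219 spectra, and 217/219 and 218/219 agree with the reported 99.08% and 99.5% up to rounding. If the reported averages were taken per origin rather than over all test spectra, small differences would be expected.

```python
import random


def split_two_to_one(n_samples, seed=0):
    """Randomly split sample indices 2:1 into training and test indices.

    Hypothetical helper for illustration; the seed is an arbitrary choice.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = (2 * n_samples) // 3  # two thirds go to the training set
    return indices[:n_train], indices[n_train:]


def recognition_rate(n_correct, n_total):
    """Overall recognition rate in percent (correct / total)."""
    return 100.0 * n_correct / n_total


train_idx, test_idx = split_two_to_one(657)
print(len(train_idx), len(test_idx))                    # 438 219
print(round(recognition_rate(217, len(test_idx)), 2))   # 99.09 (PCA + BP)
print(round(recognition_rate(218, len(test_idx)), 2))   # 99.54 (PCA + SVM)
```

Only the set sizes and the counts of correctly identified spectra come from the abstract; which spectra land in each set depends on the seed.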
Authors
董鹏凯
赵上勇
郑柯鑫
王冀
高勋
郝作强
林景全
Dong Peng-Kai; Zhao Shang-Yong; Zheng Ke-Xin; Wang Ji; Gao Xun; Hao Zuo-Qiang; Lin Jing-Quan (School of Science, Changchun University of Science and Technology, Changchun 130022, China; School of Physics and Electronics, Shandong Normal University, Jinan 250358, China)
Source
《物理学报》
SCIE
EI
CAS
CSCD
Peking University Core Journals
2021, Issue 4, pp. 61-69 (9 pages)
Acta Physica Sinica
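Funding
Supported by the National Natural Science Foundation of China (Grant No. 61575030),
the Natural Science Foundation of Jilin Province (Grant Nos. 20180101283JC, 20200301042RQ, 20180201033GX, 20190302125GGX),
and the Education Department of Jilin Province (Grant No. JJKH20190539KJ).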
Keywords
laser-induced breakdown spectroscopy
machine learning algorithm
identification of origin
ginseng