Infrared Spectral Study on the Origin Identification of Boletus tomentipes Based on the Random Forest Algorithm and Data Fusion Strategy (cited by: 5)
Abstract: Boletus tomentipes Earle is a health food favored by consumers. Nutrient accumulation in its fruiting body is affected by the growth environment (altitude, climate, etc.), and nutrient content differs significantly among geographical origins, so an accurate, rapid, and inexpensive origin identification technique is urgently needed to screen out inferior products. In this study, data fusion strategies were combined with the random forest (RF) algorithm to identify the origin of B. tomentipes, and the effects of several feature extraction methods on RF classification performance were compared. Fourier transform near-infrared (NIR) and Fourier transform mid-infrared (MIR) spectra were collected from different parts of 87 samples from four origins (north subtropical, north temperate, south subtropical, and middle subtropical zones), and their spectral characteristics were analyzed. The Kennard-Stone algorithm was used to divide the samples into a training set (58 samples, two thirds) and a validation set (29 samples, one third). RF origin identification models were built on four kinds of infrared spectra (NIR average spectra of stipes (N-b), NIR average spectra of caps (N-g), MIR average spectra of stipes (M-b), and MIR average spectra of caps (M-g)) and on data from three fusion strategies (low-level, mid-level, and high-level fusion), and the effects of features extracted by different methods (variable importance in projection (VIP) values, Boruta, and latent variables (LV)) on classification performance were compared. The optimal ntree and mtry were selected according to the out-of-bag (OOB) error rate. Model performance was evaluated by specificity, sensitivity, training set accuracy, and validation set accuracy, and these indicators were considered together to find the best method for identifying the origin of B. tomentipes. The results showed that (1) both NIR and MIR spectra reflect subtle differences among B. tomentipes from different origins; (2) discriminant models built with RF on a single spectrum performed unsatisfactorily; and (3) all three fusion strategies improved origin identification, ranking from best to worst as high-level, mid-level, and low-level fusion. The RF model built with the high-level fusion strategy based on LV features from the NIR and MIR spectra achieved high validation set accuracy (99.6%), high sensitivity (0.969), and high specificity (0.986). This approach identifies the origin of B. tomentipes accurately, rapidly, and inexpensively, and can serve as a reliable method for tracing its geographical origin.
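The abstract names the Kennard-Stone algorithm for the 58/29 split and uses sensitivity and specificity as evaluation indicators, but gives no implementation details. The following is a minimal sketch of both steps, not the authors' code. It assumes Euclidean distance on the raw feature vectors and a one-vs-rest definition of sensitivity and specificity; the function names `kennard_stone` and `sensitivity_specificity` are hypothetical, and the spectra are random placeholders rather than real measurements.

```python
import math
import random


def kennard_stone(samples, n_select):
    """Return indices of n_select samples picked by the Kennard-Stone rule.

    Assumption: Euclidean distance between raw feature vectors.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    # Pairwise distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Seed the selection with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    # nearest[i]: distance from candidate i to its closest selected sample.
    nearest = {i: min(dist[i][first], dist[i][second]) for i in remaining}
    while len(selected) < n_select:
        # Pick the candidate that is farthest from everything selected so far.
        nxt = max(sorted(remaining), key=lambda i: nearest[i])
        selected.append(nxt)
        remaining.remove(nxt)
        for i in remaining:
            nearest[i] = min(nearest[i], dist[i][nxt])
    return selected


def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`.

    Assumes y_true contains both the positive class and at least one other class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Placeholder data: 87 synthetic "spectra" with 5 features each (not real data).
rng = random.Random(0)
spectra = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(87)]

train_idx = kennard_stone(spectra, 58)
train_set = set(train_idx)
test_idx = [i for i in range(len(spectra)) if i not in train_set]
print(len(train_idx), len(test_idx))  # 58 29
```

Because Kennard-Stone is deterministic for a given distance matrix, the same spectra always yield the same training set; the samples left unselected form the validation set. In a four-origin problem, `sensitivity_specificity` would be called once per origin and the results averaged or reported per class, which is one common convention and not necessarily the one used in the paper.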
Authors: HU Yi-ran (胡翼然); LI Jie-qing (李杰庆); LIU Hong-gao (刘鸿高); FAN Mao-pan (范茂攀); WANG Yuan-zhong (王元忠) (College of Resources and Environment, Yunnan Agricultural University, Kunming 650201, China; College of Agronomy and Biotechnology, Yunnan Agricultural University, Kunming 650201, China; Institute of Medicinal Plants, Yunnan Academy of Agricultural Sciences, Kunming 650200, China)
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), indexed in SCIE, EI, CAS, CSCD, and the Peking University Core Journals list, 2020, No. 5, pp. 1495-1502 (8 pages)
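Funding: Supported by the National Natural Science Foundation of China (31660591) and the Yunnan Provincial Joint Special Fund for Basic Agricultural Research (2018FG001-033).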
Keywords: Boletus tomentipes; Geographic origin identification; Data fusion; Fourier transform mid-infrared spectrum; Fourier transform near-infrared spectrum