
Identification of the Age of Puerariae Thomsonii Radix Based on Hyperspectral Imaging and Principal Component Analysis (cited by: 2)
Abstract: Puerariae Thomsonii Radix is a plant used as both medicine and food. It contains puerarin, starch, cellulose, vitamins, and other components, and has high medicinal and edible value. Studies have shown that the content of its chemical components is closely related to its growth years. At present, the age of Puerariae Thomsonii Radix is identified mainly by traditional physicochemical techniques, which have a long operation cycle, damage the integrity of the sample, and cannot support rapid batch testing. The development of hyperspectral imaging (HIS) offers a new approach to rapid, non-destructive age identification. To avoid quality problems caused by insufficient growth years, this study combined hyperspectral imaging with machine learning to identify the age of Puerariae Thomsonii Radix. Hyperspectral image data, however, are redundant: the data volume is very large and the bands are highly correlated, which can degrade subsequent classification. Principal component analysis (PCA) was therefore used to extract features from the hyperspectral data, and four classification models, support vector machine (SVM), logistic regression (LR), multi-layer perceptron (MLP), and random forest (RF), were built on both the full-band data and the PCA-reduced data to identify samples of different ages. With full-band data, the test-set accuracies of the four models (SVM, LR, MLP, RF) were 78.09%, 77.03%, 81.43%, and 72.09% under the SN0605VNIR (VNIR) lens and 93.11%, 93.79%, 94.23%, and 89.77% under the N3124SWIR (SWIR) lens; the MLP model performed best under both lenses. With PCA-reduced data, the test-set accuracies were 96.12%, 87.53%, 95.02%, and 93.41% (VNIR) and 99.26%, 97.09%, 99.16%, and 97.91% (SWIR); the SVM model achieved the best prediction accuracy under both lenses. These results indicate that PCA-based models improve data quality, effectively reduce band redundancy, and further improve classification performance. The model parameters were then analyzed to explore how the proportion of principal components affects the prediction accuracy of the four models. Under the VNIR lens, the four models reached their highest test-set accuracy at principal component proportions of 65%, 75%, 80%, and 45%, respectively; under the SWIR lens, at 20%, 60%, 35%, and 30%, respectively. The PCA-SVM model gave the best overall performance, reaching a high prediction accuracy (99.28%) at a principal component proportion of 20%. The results show that hyperspectral imaging combined with machine learning enables rapid, non-destructive, and accurate identification of the age of Puerariae Thomsonii Radix.
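The workflow summarized in the abstract (PCA feature extraction followed by SVM, LR, MLP, and RF classifiers, compared against full-band input) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic spectra in place of the hyperspectral data, and the function names (`make_synthetic_spectra`, `build_models`, `evaluate`), the hyperparameters, and the train/test split are all hypothetical. It also assumes that the "proportion of principal components" means the number of retained components relative to the number of bands, which the abstract does not define.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=60, n_bands=80, n_classes=3, seed=0):
    """Hypothetical stand-in for per-sample spectra (not the paper's data)."""
    rng = np.random.default_rng(seed)
    wavelengths = np.linspace(0.0, 1.0, n_bands)
    X, y = [], []
    for label in range(n_classes):
        # Each class is a smooth spectrum with a shifted peak plus band noise.
        centre = 0.3 + 0.2 * label
        base = np.exp(-((wavelengths - centre) ** 2) / 0.02)
        noise = rng.normal(scale=0.05, size=(n_per_class, n_bands))
        X.append(base + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


def build_models(seed=0):
    """The four model families named in the abstract; settings are assumed."""
    return {
        "SVM": SVC(kernel="rbf"),
        "LR": LogisticRegression(max_iter=1000),
        "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                             random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }


def evaluate(X, y, pc_fraction=None, seed=0):
    """Return test accuracy per model; pc_fraction=None means full-band input."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    scores = {}
    for name, clf in build_models(seed).items():
        steps = [StandardScaler()]
        if pc_fraction is not None:
            # Assumed meaning: components kept as a fraction of the band count.
            n_components = max(1, int(round(pc_fraction * X.shape[1])))
            n_components = min(n_components, min(X_train.shape))
            steps.append(PCA(n_components=n_components, random_state=seed))
        steps.append(clf)
        pipe = make_pipeline(*steps)
        pipe.fit(X_train, y_train)  # PCA is fitted on the training split only
        scores[name] = pipe.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    print("full band:", evaluate(X, y))
    print("PCA, 20% :", evaluate(X, y, pc_fraction=0.2))
```

Calling `evaluate` over a range of `pc_fraction` values and keeping the best score per model would mirror the parameter analysis described above. Accuracies on the synthetic spectra say nothing about the values reported in the abstract.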
Authors: HU Hui-qiang; WEI Yun-peng; XU Hua-xing; ZHANG Lei; MAO Xiao-bo; ZHAO Yun-ping (School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China; National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100020, China)
Source: Spectroscopy and Spectral Analysis, 2023, Issue 6, pp. 1953-1960 (8 pages). Indexed in SCIE, EI, CAS, CSCD, and the Peking University Core Journals list.
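Funding: Supported by the Traditional Chinese Medicine Innovation Team and Talent Support Program of the National Administration of Traditional Chinese Medicine (ZYYCXTD-D-202205), the National Key Research and Development Program of China (2020YFC2006100), and the Major Increase/Decrease Expenditure Project at the Central Government Level (2060302-2101-16).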
Keywords: Hyperspectral imaging; Identification of Puerariae Thomsonii Radix growth years; Machine learning; Principal component analysis