Abstract
Tea grade evaluation is a complex, systematic task with a strong subjective component. Extracting grade information from tea quality data makes it possible to establish rapid grade identification methods, which is of considerable value for guiding tea production. To build discriminant models for the rapid evaluation of white tea grades, 200 Bai Mudan white tea samples covering 4 grades were collected, and their raw data were acquired by near infrared spectroscopy and gas chromatography-ion mobility spectrometry (GC-IMS). The dimensionality of the data was reduced by principal component analysis or linear discriminant analysis (LDA), and the reduced data were combined with 7 data mining classifier algorithms to evaluate white tea grades. The results showed that LDA was suitable for reducing the dimensionality of the raw data from both near infrared spectroscopy and GC-IMS. After LDA dimensionality reduction of the raw data, the models built on near infrared spectra with adaptive boosting (Adaboost), k-nearest neighbor (KNN), multilayer perceptron (MLP), random forest (RF), stochastic gradient descent (SGD) and support vector machines (SVM) all achieved correct classification rates above 94%, with AUC ≥ 0.95. The MLP, SGD and SVM models based on the filtered GC-IMS spectrum data had correct classification rates of 91%~93% and AUC values of 0.94~0.96. The Adaboost, decision tree (DT), KNN, MLP, SGD and SVM models built on the GC-IMS marker substance data all reached a correct classification rate of 100% and an AUC of 1.0, while the RF model reached 96% and 0.98, respectively. In summary, with near infrared spectra and volatile compound characteristic data as important parameters for white tea grade evaluation, 6 and 10 grade discrimination models were built, respectively. These models can accurately determine the grade of white tea, and the classifier algorithms are suitable for modeling both types of data.
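The workflow summarized in the abstract (dimensionality reduction, then seven classifiers scored by correct classification rate and AUC) can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the authors' code. The data generator, the hyperparameters, the 70/30 split, and the macro-averaged one-vs-rest AUC are all assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

N_GRADES = 4  # the abstract reports 4 grades of Bai Mudan white tea


def make_classifiers():
    """The seven classifier families named in the abstract (settings assumed)."""
    return {
        "Adaboost": AdaBoostClassifier(random_state=0),
        "DT": DecisionTreeClassifier(random_state=0),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
        "SGD": SGDClassifier(random_state=0),
        "SVM": SVC(kernel="rbf"),
    }


def class_scores(model, X):
    """Per-class scores: probabilities if available, else decision values."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(X)
    return model.decision_function(X)


def evaluate_classifiers():
    # Synthetic stand-in for spectral features; NOT the paper's data.
    X, y = make_classification(
        n_samples=200, n_features=60, n_informative=12, n_redundant=10,
        n_classes=N_GRADES, n_clusters_per_class=1, class_sep=1.5,
        random_state=0,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0
    )
    # One indicator column per grade, used for one-vs-rest AUC.
    y_test_onehot = label_binarize(y_test, classes=list(range(N_GRADES)))

    results = {}
    for name, clf in make_classifiers().items():
        # LDA can keep at most (number of classes - 1) components.
        model = make_pipeline(
            StandardScaler(),
            LinearDiscriminantAnalysis(n_components=N_GRADES - 1),
            clf,
        )
        model.fit(X_train, y_train)
        acc = accuracy_score(y_test, model.predict(X_test))
        auc = roc_auc_score(
            y_test_onehot, class_scores(model, X_test), average="macro"
        )
        results[name] = (acc, auc)
    return results


if __name__ == "__main__":
    for name, (acc, auc) in evaluate_classifiers().items():
        print(f"{name:9s} accuracy={acc:.2f}  AUC={auc:.2f}")
```

Swapping `LinearDiscriminantAnalysis` for `sklearn.decomposition.PCA` in the pipeline would give the other reduction route mentioned in the abstract. Numbers produced on synthetic data say nothing about the reported results.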
Authors
黄艳
罗玉琴
张灵枝
戴伟东
林智
林刚
孙威江
HUANG Yan; LUO Yuqin; ZHANG Lingzhi; DAI Weidong; LIN Zhi; LIN Gang; SUN Weijiang (Anxi College of Tea Science, Fujian Agriculture and Forestry University, Quanzhou 362400, China; College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, China; Institute of China White Tea, Fuding 355200, China; College of Tea and Food Science and Technology, Fujian Zhangzhou College of Science & Technology, Zhangzhou 363202, China; Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310008, China; Fujian Royalty Ecological Technology Co., Ltd., Fuzhou 350002, China)
Source
《食品工业科技》
CAS
Peking University Core Journals (北大核心)
2023, Issue 21, pp. 348-357 (10 pages)
Science and Technology of Food Industry
Funding
Open Project of the Institute of China White Tea (BCYJY2021K01, BCYJY2021K07)
Natural Science Foundation of Fujian Province (2019J01413)
Keywords
grade evaluation
near infrared spectroscopy
gas chromatography-ion mobility spectrometry
data mining classification algorithm