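Abstract
Jilin geographical indication (GI) rice enjoys a strong market reputation for its quality, taste, and nutritional value, so techniques for verifying the origin of GI rice are of considerable importance. Taking Liuhe County and Huinan County in Jilin Province as the study area, 62 samples of Liuhe GI rice and 58 samples of Huinan GI rice were collected, 120 samples in total. The mineral elements in the rice samples [copper (Cu), zinc (Zn), iron (Fe), manganese (Mn), potassium (K), calcium (Ca), sodium (Na), magnesium (Mg), lead (Pb), and cadmium (Cd)] were measured, and origin verification models were built with three machine learning methods: Back-Propagation Artificial Neural Network (BP-ANN), Random Forest (RF), and Support Vector Machines (SVM). Features were extracted from the mineral elements with the F-score method, and the models were evaluated and compared using 10 repetitions of 10-fold cross-validation and confusion matrices. The results show that Cu alone can serve as a typical variable representing the spatial characteristics of the region. All three models achieved good predictive performance: the BP-ANN model built on Cu and Zn reached a classification accuracy of 99.7%; the SVM model built on Cu, Zn, and Pb reached 100%; and the RF model built on six elements (Cu, Zn, Pb, Ca, Cd, and K) reached 100%. Overall, the RF and SVM models classified better than the BP-ANN model, were more stable, and are better suited to building origin verification models for the study area.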
Because of its excellent quality, good taste, and high nutritional value, Jilin geographical indication rice has a high reputation in the market, and it is of great significance to study techniques for confirming the origin of geographical indication rice. In this research, 62 rice samples were collected from Liuhe County and 58 from Huinan County. The contents of mineral elements (Cu, Zn, Fe, Mn, K, Ca, Na, Mg, Pb, Cd) were determined in the 120 rice samples. Classification models based on Back-Propagation Artificial Neural Network (BP-ANN), Random Forest (RF), and Support Vector Machines (SVM) were developed to predict the origin of geographical indication rice. Features were extracted from the mineral elements by the F-score method. Ten repetitions of 10-fold cross-validation and confusion matrices were used to evaluate and compare the classification models. The results showed that Cu can be used as a typical variable to represent the spatial characteristics of the area. The three models established by the machine learning methods all achieved good prediction performance. BP-ANN gave 99.57% correct classification of the samples based on Cu and Zn. SVM gave 100% correct classification based on Cu, Zn, and Pb. RF gave 100% correct classification based on Cu, Zn, Pb, Ca, Cd, and K. The classification accuracy of the RF and SVM models was better than that of the BP-ANN model; these two models are also more stable and more suitable for establishing the origin classification model.
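The evaluation pipeline described in the abstract (F-score feature ranking, then 10 repetitions of 10-fold cross-validation summarized in a confusion matrix) can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' method: the abstract does not give the F-score formula, so the common two-class definition (between-class separation of the means divided by the sum of within-class sample variances) is assumed; the element values are synthetic and are not the paper's measurements; and a simple nearest-centroid rule stands in for BP-ANN, RF, and SVM so that the example needs only the Python standard library. All function names are hypothetical.

```python
import random
from statistics import mean, variance


def f_score(values, labels, positive):
    """Two-class F-score of one feature (assumed definition, see lead-in)."""
    pos = [v for v, y in zip(values, labels) if y == positive]
    neg = [v for v, y in zip(values, labels) if y != positive]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within


def repeated_stratified_folds(labels, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) for `repeats` rounds of stratified k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for j, i in enumerate(idx):
                folds[j % k].append(i)  # deal each class evenly across folds
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier: assign x to the class whose training mean is closest."""
    classes = sorted(set(train_y))
    centroids = {c: mean(v for v, y in zip(train_x, train_y) if y == c)
                 for c in classes}
    return min(classes, key=lambda c: abs(x - centroids[c]))


# Synthetic data with the abstract's sample counts (62 Liuhe, 58 Huinan).
# The values are invented for illustration only.
rng = random.Random(42)
labels = ["Liuhe"] * 62 + ["Huinan"] * 58
data = {
    "Cu": [rng.gauss(2.0, 0.2) for _ in range(62)]
          + [rng.gauss(3.5, 0.2) for _ in range(58)],  # separates the classes
    "Na": [rng.gauss(10.0, 1.0) for _ in range(120)],  # carries no class signal
}

# Rank features by F-score and keep the best one.
scores = {name: f_score(vals, labels, "Liuhe") for name, vals in data.items()}
best = max(scores, key=scores.get)

# Accumulate one confusion matrix over all 10 x 10 folds.
origins = ("Liuhe", "Huinan")
confusion = {(t, p): 0 for t in origins for p in origins}
for train, test in repeated_stratified_folds(labels):
    train_x = [data[best][i] for i in train]
    train_y = [labels[i] for i in train]
    for i in test:
        pred = nearest_centroid_predict(train_x, train_y, data[best][i])
        confusion[(labels[i], pred)] += 1

total = sum(confusion.values())
accuracy = sum(confusion[(o, o)] for o in origins) / total
print(scores, confusion, f"accuracy={accuracy:.4f}")
```

Because every sample is tested once per repetition, the pooled confusion matrix holds 10 × 120 = 1200 predictions, which is why accuracies from this kind of protocol are reported to fractions of a percent. With a library such as scikit-learn, the stand-in classifier would be replaced by the actual BP-ANN, RF, or SVM estimator.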
Authors
王靖会
臧妍宇
曹崴
崔浩
郑晖
陈美文
于合龙
Wang Jinghui; Zang Yanyu; Cao Wei; Cui Hao; Zheng Hui; Chen Meiwen; Yu Helong (1. College of Information Technology, Jilin Agricultural University, Changchun 130118; 2. College of Food Science and Engineering, Jilin Agricultural University, Changchun 130118)
Source
《中国粮油学报》
EI
CAS
CSCD
PKU Core (Peking University core journals list)
2018, Issue 9, pp. 123-130 (8 pages)
Journal of the Chinese Cereals and Oils Association
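Funding
Jilin Province Key Science and Technology R&D Project (20180201051NY)
Jilin Provincial Department of Science and Technology Project (2014GB100101)
Jilin Province Science and Technology Development Plan Project (20130204046NY)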
Keywords
back propagation artificial neural network
random forest
support vector machine
k-fold cross validation
confusion matrix