Abstract
Rice is a staple grain crop in China, and its quality is closely tied to the growing environment, including soil characteristics, climate, sunshine duration, and irrigation water. Because high-quality rice can only be produced within limited regions, rice can be regarded as a distinct geographical indication product. Counterfeit or relabelled versions of well-known premium rice frequently appear on the market, damaging rice brands, weakening quality assurance for consumers, and disrupting market stability, so a rapid technique for identifying rice origin is urgently needed. In this work, laser-induced breakdown spectroscopy (LIBS) combined with machine learning algorithms is used to identify rice from five producing areas in Jilin Province (Daan, Gongzhuling, Qianguo, Songyuan, and Taoerhe). Rice origin identification models are built by combining principal component analysis (PCA) with four machine learning algorithms: Bagged Trees, Weighted KNN, Quadratic SVM, and Coarse Gaussian SVM. A total of 450 sets of LIBS data in the 200 to 900 nm range are selected from the five origins. The spectra are preprocessed with Savitzky-Golay (S-G) smoothing for noise reduction and with characteristic spectral line normalisation. PCA of the rice LIBS spectra shows that the origins form reasonably well-separated clusters, although some origins overlap in the principal component space. With 5-fold cross-validation, all four models (PCA-Bagged Trees, PCA-Weighted KNN, PCA-Quadratic SVM, and PCA-Coarse Gaussian SVM) reach an identification accuracy of 91.8% or higher, and the PCA-Quadratic SVM model reaches 97.3%. The results show that combining LIBS with machine learning algorithms can identify rice origin with high accuracy and high efficiency.
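The workflow summarised in the abstract (S-G smoothing, normalisation, PCA, four classifiers, 5-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, assuming Python with NumPy, SciPy, and scikit-learn, and it is not the authors' implementation. The spectra are synthetic stand-ins, and the line positions, smoothing window, total-intensity normalisation, number of principal components, and all hyperparameters are assumptions chosen for the example. The four classifier names resemble presets in MATLAB's Classification Learner; the scikit-learn estimators below are only approximate analogues.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the LIBS data: 5 origins x 90 spectra = 450 spectra.
N_ORIGINS, N_PER_ORIGIN, N_CHANNELS = 5, 90, 700
wavelengths = np.linspace(200.0, 900.0, N_CHANNELS)  # nm

# Hypothetical emission lines; relative heights differ by origin (made up).
line_centers = np.array([280.0, 589.0, 766.0])
line_heights = np.array([
    [1.00, 1.00, 1.00],
    [1.15, 1.30, 1.45],
    [1.30, 1.60, 1.15],
    [1.45, 1.15, 1.60],
    [1.60, 1.45, 1.30],
])
# Gaussian line profiles, shape (3, N_CHANNELS), 3 nm standard deviation.
profiles = np.exp(
    -0.5 * ((wavelengths[None, :] - line_centers[:, None]) / 3.0) ** 2
)

y = np.repeat(np.arange(N_ORIGINS), N_PER_ORIGIN)
heights = line_heights[y] * rng.normal(1.0, 0.03, size=(y.size, 3))
shot_scale = rng.uniform(0.8, 1.2, size=(y.size, 1))  # shot-to-shot drift
X_raw = shot_scale * (0.05 + heights @ profiles)
X_raw = X_raw + rng.normal(0.0, 0.02, size=X_raw.shape)

# Preprocessing: Savitzky-Golay smoothing, then per-spectrum normalisation.
# Dividing by total intensity is an assumption; the source only says that
# characteristic-line normalisation was used.
X_smooth = savgol_filter(X_raw, window_length=11, polyorder=3, axis=1)
X = X_smooth / X_smooth.sum(axis=1, keepdims=True)

# Approximate scikit-learn analogues of the four named classifiers.
classifiers = {
    "PCA-Bagged Trees": BaggingClassifier(n_estimators=30, random_state=0),
    "PCA-Weighted KNN": KNeighborsClassifier(n_neighbors=10, weights="distance"),
    "PCA-Quadratic SVM": SVC(kernel="poly", degree=2, coef0=1.0, gamma="scale"),
    # "Coarse" is read here as a broad RBF kernel (small gamma); value assumed.
    "PCA-Coarse Gaussian SVM": SVC(kernel="rbf", gamma=0.1),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
accuracies = {}
for name, clf in classifiers.items():
    # PCA and scaling sit inside the pipeline so they are refit on each
    # training fold and never see the held-out fold.
    model = make_pipeline(PCA(n_components=3), StandardScaler(), clf)
    accuracies[name] = cross_val_score(model, X, y, cv=cv).mean()

for name, acc in accuracies.items():
    print(f"{name}: mean 5-fold accuracy = {acc:.3f}")
```

Keeping PCA inside the cross-validated pipeline is a design choice made for this sketch; the abstract does not state whether the PCA was fitted on the full data set or per fold. Accuracies printed by this example reflect the synthetic data only and say nothing about the reported 91.8% and 97.3% figures.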
Authors
SONG Shao-zhong (宋少忠)
FU Shao-yan (符少燕)
LIU Yuan-yuan (刘园园)
QI Chun-yan (齐春艳)
LI Jing-peng (李景鹏)
GAO Xun (高勋)
Affiliations
School of Data Science and Artificial Intelligence, Jilin Normal University of Engineering and Technology, Changchun 130052, China
School of Physics, Changchun University of Science and Technology, Changchun 130022, China
Rice Research Institute, Jilin Academy of Agricultural Sciences, Changchun 130033, China
Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China
Source
Spectroscopy and Spectral Analysis (《光谱学与光谱分析》)
SCIE
EI
CAS
CSCD
PKU Core (北大核心)
2024, Issue 6, pp. 1553-1558 (6 pages)
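Funding
National Natural Science Foundation of China (61575030)
Jilin Provincial Department of Science and Technology projects (2020122348JC, 20200602054ZP)
Jilin Provincial Development and Reform Commission project (2020C019-6)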
Keywords
Laser-induced breakdown spectroscopy
Machine learning algorithms
Identification of rice production areas
Identification accuracy