Expansion and Prediction of Small Sample Data of Contaminants in Grain Processing Using Combination of Generative Adversarial Networks and Deep Forest
Abstract Accurate prediction of pollutants in grain processing is important for food safety. However, because grain processing is complex and pollutant detection is difficult, the available data are too few to support predictive modeling, so a method for expanding small-sample pollutant data is needed. In addition, small-sample pollutant data from grain processing often lack sufficient prior knowledge, so traditional supervised learning methods predict them with low accuracy, and existing continuous deep learning models are not suited to grain processing, which is an intermittent process. A prediction method based on unsupervised learning and discrete deep learning is therefore required. This study proposes, for pollutants in grain processing, a data expansion method based on time generative adversarial networks (TimeGAN) and a prediction method that combines generative adversarial networks (GAN) with deep forest (DF). First, a TimeGAN model is constructed that learns from the small-sample data and generates multiple sets of sample data, achieving data expansion. Then the unsupervised GAN model is combined with the DF model, which suits discrete processes, to build a GAN-DF model for pollutant prediction. Next, the DF and long short-term memory (LSTM)-DF models are each embedded in the GAN as the generator, yielding the DFGAN and LSTM-DFGAN models, which further improve prediction accuracy. Simulation and verification on data for the heavy metal pollutant lead (Pb content) in rice processing show that TimeGAN is feasible for data expansion and that the LSTM-DFGAN model has the best overall prediction performance. After data expansion, its mean absolute error and root mean square error were as low as 7.50×10^(-5) mg/kg and 1.60×10^(-8) mg/kg, respectively.
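The two error metrics named in the abstract, mean absolute error and root mean square error, can be computed as in the minimal sketch below. The function names and the sample Pb values are hypothetical illustrations, not data or code from the paper.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical Pb contents in mg/kg (illustrative only, not from the study).
measured = [0.10, 0.12, 0.11, 0.13]
predicted = [0.11, 0.12, 0.10, 0.15]

mae = mean_absolute_error(measured, predicted)
rmse = root_mean_square_error(measured, predicted)
print(f"MAE = {mae:.4f} mg/kg, RMSE = {rmse:.4f} mg/kg")
```

Both metrics carry the unit of the predicted quantity, here mg/kg.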
Authors GUO Xianglan (郭香兰); WANG Li (王立); JIN Xuebo (金学波); YU Jiabin (于家斌); BAI Yuting (白玉廷); LI Hanyu (李涵宇); WEI Li'ang (隗立昂); MA Qian (马倩); WEN Haoran (温浩然) (School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 102488, China)
Source Food Science (《食品科学》); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2024, Issue 12, pp. 22-30 (9 pages)
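Funding Key Special Project of the National Key Research and Development Program of China during the 13th Five-Year Plan (2020YFC1606801).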
Keywords generative adversarial networks; deep forest; grain processing; pollutant prediction