Abstract
Classifying questions about agricultural diseases and insect pests is hampered by a shortage of public datasets, short texts, sparse features, and implicit semantic information that is hard to learn. To address these problems, a dataset for agricultural disease and pest question classification was constructed using the website 火爆农资招商网 (an agricultural supplies trade site) as the data source, and a deep learning model, BERT_Stacked LSTM, was proposed for the task. First, the BERT component obtains character-level semantic information for each question and generates hidden vectors containing sentence-level feature information. Then, a stacked long short-term memory network (Stacked LSTM) learns the hidden complex semantic information. Experimental results show that, compared with the other baseline models, the proposed model is better suited to classifying agricultural disease and pest questions, reaching an F1 score of 95.76%. The model was also tested on a public general-domain dataset, where it reached an F1 score of 98.44%, indicating that it generalizes well.
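The abstract reports results as F1 scores but does not say which averaging scheme was used. As a minimal sketch, assuming macro-averaging (the unweighted mean of per-class F1), the metric could be computed as below. The function name `macro_f1` and the three class labels are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores.

    Assumption: the paper's averaging scheme is not stated in the abstract;
    macro-averaging is used here purely for illustration.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted but is wrong
            fn[t] += 1          # true class t was missed

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical question categories, for illustration only.
y_true = ["disease", "pest", "pest", "disease", "control", "control"]
y_pred = ["disease", "pest", "disease", "disease", "control", "pest"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

Macro-averaging weights every question category equally, so rare categories influence the score as much as frequent ones; a micro- or weighted average would give different numbers on imbalanced data.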
Authors
李林
刁磊
唐詹
柏召
周晗
郭旭超
LI Lin; DIAO Lei; TANG Zhan; BAI Zhao; ZHOU Han; GUO Xuchao (College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)
Source
《农业机械学报》
EI
CAS
CSCD
PKU Core (北大核心)
2021, Issue S01, pp. 172-177 (6 pages)
Transactions of the Chinese Society for Agricultural Machinery
Funding
National Key Research and Development Program of China (2016YFD0300710)