
Chinese Text Classification Based on BERT and Attention

Cited by: 16
Abstract: Text classification is an important area of natural language processing, and in recent years deep learning methods have been widely applied to it. To balance classification accuracy and processing efficiency on large-scale data, this paper uses word vectors trained by the BERT pre-trained language model as the embedding layer, further refining the word vectors of the input sentence. A bidirectional GRU network then serves as the backbone to fully extract the contextual features of the text, and finally an attention mechanism highlights the key parts of the target sentence for classification. Experiments show that using BERT as the embedding layer effectively improves the word vectors, and that the proposed BBGA model processes text efficiently: on the THUCNews dataset it reaches an accuracy of 94.34%, which is 5.20% higher than TextCNN and 1.01% higher than BERT-CNN.
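The pipeline described in the abstract (BERT word vectors as the embedding layer, a bidirectional GRU backbone, and an attention layer that pools the sequence before classification) maps naturally onto a few lines of PyTorch. The sketch below is an illustrative reconstruction, not the authors' code: the checkpoint name bert-base-chinese, the hidden size, and the class count are assumptions, and it relies on the HuggingFace transformers library's BertModel.

```python
# Minimal sketch of the BBGA architecture described in the abstract:
# BERT embeddings -> bidirectional GRU -> attention -> classifier.
# Hyperparameters (hidden size, number of classes, checkpoint name)
# are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn
from transformers import BertModel

class BBGA(nn.Module):
    def __init__(self, num_classes=14, hidden_size=256,
                 bert_name="bert-base-chinese"):
        super().__init__()
        # BERT supplies contextualized word vectors for the embedding layer.
        self.bert = BertModel.from_pretrained(bert_name)
        # A bidirectional GRU extracts contextual features in both directions.
        self.gru = nn.GRU(self.bert.config.hidden_size, hidden_size,
                          batch_first=True, bidirectional=True)
        # Additive attention scores each time step so salient tokens
        # dominate the sentence representation.
        self.attn = nn.Linear(2 * hidden_size, 1)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, bert_hidden): contextualized token embeddings.
        embeddings = self.bert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
        outputs, _ = self.gru(embeddings)           # (batch, seq_len, 2*hidden)
        scores = self.attn(outputs).squeeze(-1)     # (batch, seq_len)
        # Mask padding positions before normalizing the attention weights.
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        sentence = (weights * outputs).sum(dim=1)   # weighted sum over tokens
        return self.classifier(sentence)            # (batch, num_classes)
```

In this family of models, a GRU is typically preferred over an LSTM because its two-gate design has fewer parameters and trains faster at comparable accuracy, which matches the abstract's stated goal of balancing precision and processing efficiency.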
Authors: SUN Hong; CHEN Qiang-yue (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD / Peking University Core, 2022, Issue 1, pp. 22-26 (5 pages)
Funding: National Natural Science Foundation of China (61472256, 61170277, 61703277); Key Scientific Research Innovation Project of the Shanghai Municipal Education Commission (12zz137); Hujiang Foundation (C14002).
Keywords: text classification; natural language processing; BERT; deep learning