Text Classification Method Based on Contrastive Learning and Attention Mechanism
Abstract: Text classification, a fundamental task in natural language processing, plays an important role in applications such as information retrieval, machine translation, and sentiment analysis. However, most deep models do not fully exploit the rich information in training instances when making predictions, so the learned text features are incomplete. To make full use of training instance information, this paper proposes a text classification method based on contrastive learning and an attention mechanism. First, a supervised contrastive learning training strategy is designed to optimize the retrieval of text vector representations, improving the quality of the training instances retrieved during inference. Second, an attention mechanism is constructed to learn an attention distribution over the retrieved training text features, focusing on neighboring instances with stronger relevance and capturing more implicit similarity features. Finally, the attention mechanism is combined with the model network, fusing information from neighboring training instances to strengthen the model's ability to extract diverse features and to capture both global and local features. Experimental results show that the method yields significant performance gains across multiple models, including Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), Graph Convolutional Network (GCN), BERT, and RoBERTa. With the CNN model, for example, the macro-F1 score improves by 4.15, 6.2, and 1.92 percentage points on the THUCNews, Toutiao, and Sogou datasets, respectively. The method thus offers an effective solution for text classification tasks.
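The abstract describes two computational components: a supervised contrastive objective that trains the encoder so that same-class texts embed close together, and an inference-time attention step that weighs retrieved training instances and fuses them with the query representation. Below is a minimal PyTorch sketch of both ideas, in the style of a standard supervised contrastive (SupCon) loss; it illustrates the general technique rather than the authors' implementation, and the temperature tau, the residual-style fusion, and all function names are assumptions.

```python
# Minimal sketch, NOT the paper's implementation: a standard supervised
# contrastive loss plus attention-weighted fusion of retrieved neighbors.
# tau, the residual fusion, and all names are illustrative assumptions.
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(embeddings, labels, tau=0.1):
    """Pull same-label embeddings together, push different labels apart."""
    z = F.normalize(embeddings, dim=1)                 # (N, d), unit norm
    sim = z @ z.t() / tau                              # (N, N) scaled cosine
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))    # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos.sum(dim=1)
    # Mean log-likelihood of positive pairs per anchor (for anchors that have any).
    per_anchor = -log_prob.masked_fill(~pos, 0.0).sum(dim=1) / pos_counts.clamp(min=1)
    return per_anchor[pos_counts > 0].mean()


def attention_fuse_neighbors(query, neighbor_feats, tau=0.1):
    """Attention distribution over k retrieved training instances, then fusion."""
    scores = neighbor_feats @ query / tau              # (k,) relevance logits
    attn = torch.softmax(scores, dim=0)                # attention weights
    neighbor_summary = attn @ neighbor_feats           # (d,) weighted neighbor info
    return query + neighbor_summary                    # residual-style fusion (assumed)
```

During training, such a loss would typically be combined with the usual cross-entropy objective; at inference, neighbor_feats would come from a nearest-neighbor lookup over training-set embeddings, as sketched after the keywords below.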
Authors: QIAN Lai (钱来), ZHAO Weiwei (赵卫伟), School of Information and Communication, National University of Defense Technology, Wuhan 430010, Hubei, China
Source: Computer Engineering (《计算机工程》; CAS, CSCD, Peking University core journal), 2024, Issue 7, pp. 104-111 (8 pages)
Funding: national ministry-level fund
Keywords: text classification; deep model; contrastive learning; approximate nearest neighbor algorithm; attention mechanism
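The keyword "approximate nearest neighbor algorithm" indicates that neighbor instances are retrieved from the training set at inference time. A minimal sketch of how such a lookup could be set up with Faiss follows; the index type, embedding dimension, and k are illustrative assumptions, not details from the paper.

```python
# Hypothetical neighbor lookup over training-set embeddings with Faiss.
# d, k, and the index type are assumed; the paper does not specify them here.
import faiss
import numpy as np

d, k = 768, 16                                  # embedding dim and neighbor count (assumed)
train_embs = np.random.rand(10000, d).astype('float32')  # stand-in for encoder outputs
faiss.normalize_L2(train_embs)                  # unit norm: inner product == cosine

index = faiss.IndexFlatIP(d)                    # exact search; an ANN index such as
index.add(train_embs)                           # IndexHNSWFlat would replace it at scale

query = np.random.rand(1, d).astype('float32')  # stand-in for a test text's embedding
faiss.normalize_L2(query)
scores, ids = index.search(query, k)            # top-k most similar training instances
neighbor_feats = train_embs[ids[0]]             # input to the attention fusion above
```

The retrieved neighbor_feats would then be passed to the attention fusion step sketched after the abstract.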