Multi-Label Text Classification Model Combining CNN-SAM and GAT
Cited by: 4
Abstract: Existing neural-network-based approaches to multi-label text classification have two shortcomings: they do not fully extract the feature information in a text, and they rarely mine the correlations among global labels from graph-structured data. To address both problems, this paper proposes CS-GAT, a multi-label text classification model that combines a convolutional neural network with a self-attention mechanism (CNN-SAM) and a graph attention network (GAT). The model uses a multi-layer convolutional neural network and self-attention to extract the local and global information of a text and fuses the two into a more comprehensive feature vector representation. In parallel, the correlations among text labels are converted into an edge-weighted graph that carries global information. A multi-layer graph attention mechanism automatically learns the degree of association between labels, and the result interacts with the contextual semantic information of the text to yield a global label representation that is semantically tied to the text. Finally, an adaptive fusion strategy further extracts feature information from both representations and improves the model's generalization ability. Experiments on three public English datasets, AAPD, RCV1-V2, and EUR-Lex, show that the model clearly outperforms the other mainstream baseline models on multi-label classification.
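The abstract does not say how the edge-weighted label graph is built or how the graph attention layers are parameterized. The sketch below is one plausible reading, not the authors' implementation. It assumes edge weights are conditional co-occurrence frequencies estimated from training labels, and it applies a single-head graph attention step with no learned projection matrix. The names `cooccurrence_graph`, `gat_layer`, `a_src`, and `a_dst` are hypothetical, and adding the log edge weight to the attention score is likewise an assumption made for illustration.

```python
import math
from itertools import combinations


def cooccurrence_graph(label_sets, num_labels):
    """Build an edge-weighted label graph (assumed construction).

    weights[i][j] estimates P(label j | label i) from co-occurrence counts;
    every label also gets a self-loop of weight 1.0.
    """
    count = [0] * num_labels
    pair = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        unique = sorted(set(labels))
        for i in unique:
            count[i] += 1
        for i, j in combinations(unique, 2):
            pair[i][j] += 1
            pair[j][i] += 1
    weights = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        for j in range(num_labels):
            if i == j:
                weights[i][j] = 1.0
            elif count[i] > 0:
                weights[i][j] = pair[i][j] / count[i]
    return weights


def softmax(xs):
    """Numerically stable softmax over a non-empty list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, weights, a_src, a_dst):
    """One single-head graph attention step over the label graph.

    score(i, j) = LeakyReLU(a_src . h_i + a_dst . h_j) + log(weights[i][j]),
    normalized with softmax over the neighbors j that have weights[i][j] > 0.
    Returns the updated label features and the attention matrix.
    """
    n, dim = len(features), len(features[0])
    attention = [[0.0] * n for _ in range(n)]
    output = []
    for i in range(n):
        neighbors = [j for j in range(n) if weights[i][j] > 0]
        src = sum(a * h for a, h in zip(a_src, features[i]))
        scores = []
        for j in neighbors:
            dst = sum(a * h for a, h in zip(a_dst, features[j]))
            scores.append(leaky_relu(src + dst) + math.log(weights[i][j]))
        for j, alpha in zip(neighbors, softmax(scores)):
            attention[i][j] = alpha
        output.append([sum(attention[i][j] * features[j][k] for j in neighbors)
                       for k in range(dim)])
    return output, attention


# Toy data: four documents over three labels (illustrative values only).
label_sets = [[0, 1], [0, 1], [0, 2], [1]]
weights = cooccurrence_graph(label_sets, num_labels=3)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated, attention = gat_layer(features, weights, a_src=[0.5, -0.5], a_dst=[0.3, 0.3])
```

In this toy run, labels 1 and 2 never co-occur, so they exchange no attention, while each label's updated vector is a weighted average of its own features and those of its co-occurring labels. A full model would stack several such layers with learned projections and multiple heads, as the keywords suggest.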
Authors: YANG Chunxia; MA Wenwen; CHEN Qigang; GUI Qiang (School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China; Jiangsu Key Laboratory of Big Data Analysis Technology (B-DAT), Nanjing 210044, China; Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing 210044, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2023, No. 5, pp. 106-114 (9 pages)
Funding: National Natural Science Foundation of China (61273229); Qing Lan Project of Jiangsu Province.
Keywords: multi-label text classification; multi-layer convolutional neural network; self-attention mechanism; multi-head graph attention mechanism