Abstract
Text classification is an important research topic in natural language processing. In recent years, graph neural networks (GNNs) have achieved good results on this typical task. However, existing graph-based text classification methods suffer from interference by edge noise and node noise, and they lack hierarchical text information and positional information. To address these problems, we propose Text-HARC, a hierarchical affine graph neural network model for text classification based on regularization constraints. The model combines a graph attention network (GAT) with a gated graph neural network (GGNN), introduces regularization constraints to filter node and edge noise, and uses an affine module and relative position encoding, respectively, to enrich the word representations. Experiments on four benchmark datasets (TREC, SST1, SST2, and R8) show a clear improvement in accuracy, and ablation experiments further verify the effectiveness of the method.
Authors
甘玲
刘菊
GAN Ling; LIU Ju (School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China; School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China)
Source
《重庆邮电大学学报(自然科学版)》
CSCD
Peking University Core Journal (北大核心)
2023, Issue 4, pp. 715-721 (7 pages)
Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition)
Funding
National Natural Science Foundation of China (61272195).
Keywords
text classification
graph neural network
information fusion
regularization constraint
hierarchical affine