Abstract
An entity-information-fused graph convolutional network model (ETGCN) is proposed for short text classification. First, an entity linking tool extracts the entities in the short text. Then, a graph convolutional network models documents, entities and words jointly to enrich the latent semantic features of the text. The learned word node representations are concatenated with BERT word embeddings and fed to a bidirectional long short-term memory network to further mine the contextual semantic features of the text. These contextual features are fused with the text features obtained by the graph neural network, and the fused features are used for classification. Experimental results show that the model achieves classification accuracies of 88.38%, 93.87% and 82.87% on the AGNews, R52 and MR datasets, respectively, outperforming most mainstream baseline methods.
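The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four-node toy graph, the weight matrix, and the `context` vectors (standing in for BERT word embeddings) are all made-up values, the standard symmetric-normalized graph convolution is assumed, and mean pooling replaces the bidirectional LSTM to keep the example dependency-free.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # self-loops guarantee deg >= 1
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(adj_norm @ features @ weight)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical toy graph: node 0 = document, node 1 = entity, nodes 2-3 = words.
adj = [
    [0, 1, 1, 1],  # the document links to its entity and both words
    [1, 0, 1, 0],  # the entity links to the document and the first word
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
adj_norm = normalize_adjacency(adj)

# One-hot input features and an arbitrary 4x2 weight matrix (assumed values).
features = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weight = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6], [0.2, 0.1]]
hidden = gcn_layer(adj_norm, features, weight)

# Placeholder contextual vectors standing in for BERT word embeddings.
context = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Concatenate each word's graph representation with its contextual vector.
word_inputs = [node + ctx for node, ctx in zip(hidden[2:4], context)]

# Stand-in for the BiLSTM: mean-pool the word vectors into one sequence feature.
pooled = [sum(col) / len(word_inputs) for col in zip(*word_inputs)]

# Fuse the document node representation with the pooled sequence feature.
fused = hidden[0] + pooled
print(len(fused))
```

In a full model, `fused` would be passed to a classification layer, and the weights would be learned rather than fixed.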
Authors
WANG Jiayu (王佳宇)
LI Ying (李楹)
MA Chunmei (马春梅)
WU Donghao (吴东昊)
JIANG Lifen (姜丽芬)
College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
Source
Journal of Tianjin Normal University: Natural Science Edition (《天津师范大学学报(自然科学版)》)
CAS
PKU Core (北大核心)
2023, No. 1, pp. 67-72 (6 pages)
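Funding
National Natural Science Foundation of China (61902282)
Key Project of the Natural Science Foundation of Tianjin (18JCYBJC8900, 18JCQNJC70200, 20JCZDJC000)
Scientific Research Program of the Tianjin Municipal Education Commission (2018KJ155)
Tianjin Science and Technology Development Fund (JW1702)
Guangdong Province Science and Technology Program (2017KQNCX194)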
Keywords
short text classification
entity information
graph convolutional neural networks