Method for Few-Shot Short Text Classification Based on Heterogeneous Graph Convolutional Network

Cited by: 11
Abstract: To address the semantic sparsity and overfitting that arise in few-shot short text classification, this paper proposes HGCN-RN, a few-shot short text classification model that uses a dual-level attention mechanism in a heterogeneous graph convolutional network to learn the importance of different neighboring nodes and of different node types to the current node. The BTM topic model is used to extract topic information from the short text datasets, and a heterogeneous information network for short texts that integrates entity and topic information is constructed to address semantic sparsity. On this basis, a heterogeneous graph convolutional network based on random neighbor reduction and the dual-level attention mechanism is constructed to extract semantic information from the heterogeneous information network; random neighbor reduction also serves as data augmentation to alleviate overfitting. Experimental results on three short text datasets show that, compared with baseline models such as LSTM, Text GCN, and HGAT, the proposed model still achieves the best performance when only ten labeled samples are available per class.
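The abstract names random neighbor reduction as the data-augmentation step but does not spell out the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation: each undirected edge of the graph is dropped independently with a fixed probability, and self-loops are kept. The function name `drop_random_neighbors`, the parameter `drop_prob`, and the toy adjacency matrix are all assumptions made for illustration.

```python
import numpy as np


def drop_random_neighbors(adj, drop_prob, rng):
    """Return a copy of a symmetric adjacency matrix with edges randomly removed.

    Assumption (not from the paper): each undirected edge is dropped
    independently with probability `drop_prob`; self-loops are always kept.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Draw one keep/drop decision per undirected edge (strict upper triangle).
    keep_upper = np.triu(rng.random((n, n)) >= drop_prob, k=1)
    # Mirror the decisions so the result stays symmetric.
    keep = keep_upper | keep_upper.T
    # Never remove a node's connection to itself.
    np.fill_diagonal(keep, True)
    return adj * keep


# Toy graph with 4 nodes and self-loops on the diagonal (hypothetical data).
adj = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)

rng = np.random.default_rng(0)
augmented = drop_random_neighbors(adj, drop_prob=0.3, rng=rng)
```

Calling the function with a fresh random draw in every training epoch would give the network a slightly different neighborhood for each node each time, which is the usual way this kind of augmentation is meant to reduce overfitting.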
Authors: YUAN Ziyong (袁自勇); GAO Shu (高曙); CAO Jiao (曹姣); CHEN Liangchen (陈良臣) (College of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China; Library Network Information Center, Yiyang Medical College, Yiyang, Hunan 413046, China; Applied Technology College, China University of Labor Relations, Beijing 100048, China)
Source: Computer Engineering (《计算机工程》); indexed in CAS, CSCD, and the Peking University Core Journals list; 2021, Issue 12, pp. 87-94 (8 pages)
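Funding: National Natural Science Foundation of China (51679180); Fundamental Research Funds for the Central Universities, China University of Labor Relations (21ZYJS017).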
Keywords: few-shot short text classification; heterogeneous graph convolution network; heterogeneous information network for short text; BTM topic model; overfitting