
TextCGA: A Text Classification Network Based on a Pre-Trained Model (cited by: 2)
Abstract: Text classification is a classic task in NLP. Most current text classification networks rely on RNNs, which suffer from short-term memory and therefore cannot classify long texts accurately. To address this, the language model and the classification network are decoupled: an NLP pre-trained model is applied to the text classification task, and the TextCGA classification network is proposed. TextCGA uses the pre-trained model as its language model, exploiting the model's strong semantic representation ability to encode the text. To overcome the short-term memory problem of RNNs on long sequences, a CGA block is built from a convolution layer, an RNN layer, and a Self-Attention layer, which effectively handles long-sequence modeling. Multiple CGA blocks are placed in the network so that the model can capture text features at multiple receptive fields. Experimental results show that the TextCGA network with a pre-trained language model achieves good classification performance, improving accuracy by 1-2 percentage points over the compared methods in the reported experiments.
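
The abstract fixes only the CGA block's layer order (convolution, then RNN, then Self-Attention) and the use of several blocks to obtain multiple receptive fields; the sketch below fills in the rest with assumptions. A GRU as the RNN variant, kernel sizes 3/5/7, mean pooling, the head count, layer widths, and a frozen BERT-style encoder are all illustrative choices, not the paper's published configuration.

import torch
import torch.nn as nn


class CGABlock(nn.Module):
    """One CGA block: Convolution + GRU + self-Attention over a token sequence."""

    def __init__(self, dim: int, kernel_size: int):
        super().__init__()
        # Same-padding 1-D convolution; the kernel width sets this block's receptive field.
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2)
        # A bidirectional GRU (assumed RNN variant) models order on top of the local conv features.
        self.gru = nn.GRU(dim, dim // 2, batch_first=True, bidirectional=True)
        # Self-attention lets distant tokens interact directly, easing the
        # short-term-memory problem a plain RNN has on long sequences.
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.gru(h)
        h, _ = self.attn(h, h, h)
        return h


class TextCGA(nn.Module):
    """Pre-trained encoder as the language model, plus parallel CGA blocks."""

    def __init__(self, encoder: nn.Module, dim: int = 768, num_classes: int = 10):
        super().__init__()
        self.encoder = encoder  # e.g. a BERT-style model returning last_hidden_state (assumption)
        # Several CGA blocks with different kernel sizes capture features
        # at multiple receptive fields, as the abstract describes.
        self.blocks = nn.ModuleList(CGABlock(dim, k) for k in (3, 5, 7))
        self.classifier = nn.Linear(dim * 3, num_classes)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Decoupling: the pre-trained language model is used only as a feature extractor here.
        with torch.no_grad():
            x = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        # Mean-pool each block's output and concatenate (the pooling choice is an assumption).
        pooled = [block(x).mean(dim=1) for block in self.blocks]
        return self.classifier(torch.cat(pooled, dim=-1))

A typical call would wrap a HuggingFace encoder, e.g. TextCGA(AutoModel.from_pretrained("bert-base-chinese")). Whether the paper freezes the encoder or fine-tunes it is not stated in the abstract, so the torch.no_grad() freeze above is one reading of the "decoupling" it describes.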
Authors: YANG Wei-qi (杨玮祺) and DU Ye (杜晔), School of Computer Science and Information Technology, Beijing Jiaotong University, Beijing 100044
Source: Modern Computer (《现代计算机》), 2020, No. 12, pp. 52-57 (6 pages)
Keywords: Text Classification; Pre-Trained Model; CGA Module; TextCGA
