
Latent Topic-Enriched Co-Attention Networks for Domain Classification

Cited by: 1
Abstract: Attention-based neural network models have shown promising results in text classification. However, when the training data are limited, or the distribution of the test data differs considerably from that of the training data, some informative words are hard for the model to capture during training. This paper therefore proposes a new domain classification method based on co-attention networks. A latent topic model is used to learn latent topic attention, which is introduced into the bidirectional long short-term memory network (BiLSTM) commonly used for text classification and combined with soft or hard attention to form the co-attention. Experiments on the SMP-ECDT benchmark corpus for Chinese utterance domain classification show that the proposed latent topic co-attention network clearly outperforms plain attention mechanisms, improving classification accuracy over the baseline soft attention (Soft att), hard attention (Hard att), and standalone latent topic attention (BTM att) by 2.85%, 1.86%, and 1.74%, respectively. The results further verify that training the latent topics on additional unlabeled data can improve the domain classification performance of the proposed method even more.
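Based on the abstract above, the sketch below illustrates one plausible way a latent topic attention signal could be combined with network (soft) attention over BiLSTM states for domain classification. It is not the authors' released implementation; the class name CoAttentionBiLSTM, the per-token topic feature input topic_feats (e.g. word-topic weights from a biterm topic model), and the simple additive score combination are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionBiLSTM(nn.Module):
    """Sketch: BiLSTM domain classifier whose sentence representation is a
    weighted sum of hidden states, with weights formed by combining a
    soft (network) attention score and a latent-topic attention score."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, topic_dim, num_domains):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Soft (network) attention over BiLSTM hidden states.
        self.att_proj = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        self.att_vec = nn.Linear(2 * hidden_dim, 1, bias=False)
        # Projects a pre-computed word-level topic signal (e.g. from a
        # biterm topic model) into a scalar attention score per token.
        self.topic_proj = nn.Linear(topic_dim, 1, bias=False)
        self.classifier = nn.Linear(2 * hidden_dim, num_domains)

    def forward(self, token_ids, topic_feats):
        # token_ids:   (batch, seq_len)
        # topic_feats: (batch, seq_len, topic_dim) word-topic features
        h, _ = self.bilstm(self.embed(token_ids))              # (B, T, 2H)
        net_score = self.att_vec(torch.tanh(self.att_proj(h))).squeeze(-1)
        topic_score = self.topic_proj(topic_feats).squeeze(-1)
        # Co-attention: combine the two score sources before normalizing.
        weights = F.softmax(net_score + topic_score, dim=-1)   # (B, T)
        sent = torch.bmm(weights.unsqueeze(1), h).squeeze(1)   # (B, 2H)
        return self.classifier(sent)                           # (B, num_domains)
```

In this sketch the topic features could come from a topic model trained on additional unlabeled utterances, which is consistent with the abstract's observation that extra unlabeled data further improves domain classification performance.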
Authors: HUANG Peisong (黄培松), HUANG Peijie (黄沛杰), DING Jiande (丁健德), AI Wencheng (艾文程), ZHANG Jinchuan (章锦川), College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510642, China
Source: Journal of Chinese Information Processing (《中文信息学报》, CSCD, Peking University Core Journal), 2020, No. 2, pp. 73-79 (7 pages)
Funding: National Natural Science Foundation of China (71472068); Guangdong Provincial College Students' Innovation Training Program (201810564094)
Keywords: domain classification; co-attention; latent topic; BiLSTM
