Feature fusion based recurrent neural network for spoken language understanding

Abstract: The performance of spoken language understanding (SLU) is of fundamental importance to a spoken dialogue system. A feature-fusion-based recurrent neural network structure is proposed on the basis of analyzing the structures of the basic recurrent neural network (RNN) and its variants, the long short-term memory (LSTM) network and the gated recurrent unit (GRU) network. In the proposed structure, the input is first sent to a hidden layer and trained to obtain a feature representation. This feature information, together with the source input and the historical output information, is then sent to another hidden layer for training, and finally to the output layer to obtain the result. SLU experiments were carried out on the ATIS database using the above recurrent neural network structures and the proposed model. The results show that the performance of the feature-fusion-based recurrent neural network structure is better than that of the conventional recurrent neural network and its variants.
Authors: ZHANG Jingjing; HUANG Hao; HU Ying; Wushour Silamu (School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)
Source: Modern Electronics Technique (Peking University core journal), 2018, No. 20, pp. 157-160 (4 pages)
Funding: National Natural Science Foundation of China (61365005, 61663044, 61761041); Xinjiang University Doctoral Research Start-up Fund (BS160239)
Keywords: spoken language understanding (SLU); recurrent neural network (RNN); long short-term memory (LSTM); gated recurrent unit (GRU); feature fusion; natural language
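The abstract describes a two-stage recurrence: the input is first mapped to a feature representation in one hidden layer, and that representation is then fused with the source input and the historical output before a second hidden layer and the output layer. A minimal numpy sketch of one such time step is given below. All layer sizes, the use of concatenation for fusion, and the tanh/softmax choices are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): input embedding,
# feature layer, fusion hidden layer, and output (slot-label) sizes.
d_in, d_feat, d_hid, d_out = 8, 16, 16, 4

# First hidden layer: maps the raw input to a feature representation.
W_f = rng.standard_normal((d_feat, d_in)) * 0.1
# Second hidden layer: consumes feature + source input + previous output.
W_h = rng.standard_normal((d_hid, d_feat + d_in + d_out)) * 0.1
# Output layer.
W_o = rng.standard_normal((d_out, d_hid)) * 0.1

def step(x, y_prev):
    """One time step of the assumed feature-fusion recurrence."""
    f = np.tanh(W_f @ x)                    # feature representation
    fused = np.concatenate([f, x, y_prev])  # feature + input + history
    h = np.tanh(W_h @ fused)                # second hidden layer
    logits = W_o @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                      # softmax over output labels

# Run over a short sequence of random "word embeddings".
y = np.zeros(d_out)
for t in range(5):
    x_t = rng.standard_normal(d_in)
    y = step(x_t, y)
```

Feeding the previous output `y` back into the fusion layer is what makes the structure recurrent over the label history rather than over a hidden state alone; in practice the weights would be learned by backpropagation through time rather than drawn at random.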
