Research on a Multi-task Joint Model for Task-oriented Chatbots
Abstract In building a task-oriented chatbot, several natural language processing subtasks generally need to be performed. The traditional approach is to train each subtask independently and then integrate them, which ignores the correlations between subtasks and limits the model's predictive power. This paper proposes a compressed joint model, Joint-RoBERTa-WWM-of-Theseus. On the one hand, the three subtasks of intent classification, domain classification, and semantic slot filling are trained together through multi-task joint learning, with a focal loss introduced into the multi-class classification subtasks to address imbalanced data distributions. On the other hand, the model is compressed with the Theseus method, which greatly improves prediction speed at only a slight cost in accuracy, improving the model's real-time performance and practicality in production environments.
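The abstract describes the two training-side ideas concretely enough to sketch. Below is a minimal, hypothetical PyTorch sketch of the joint architecture: one shared encoder feeding three task heads, with a focal loss for the imbalanced sentence-level classification subtasks. The checkpoint name, label counts, and head design are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class FocalLoss(nn.Module):
    # FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t): down-weights easy,
    # well-classified examples so rare classes contribute more to the gradient.
    def __init__(self, gamma=2.0, alpha=1.0):
        super().__init__()
        self.gamma, self.alpha = gamma, alpha

    def forward(self, logits, targets):
        log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        return (-self.alpha * (1.0 - pt) ** self.gamma * log_pt).mean()

class JointModel(nn.Module):
    # Shared encoder with three heads: sentence-level intent and domain
    # classification from the [CLS] vector, token-level slot filling.
    def __init__(self, encoder_name, n_intents, n_domains, n_slots):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.intent_head = nn.Linear(hidden, n_intents)
        self.domain_head = nn.Linear(hidden, n_domains)
        self.slot_head = nn.Linear(hidden, n_slots)

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls = states[:, 0]
        return self.intent_head(cls), self.domain_head(cls), self.slot_head(states)

# Joint training sums the three per-task losses; "hfl/chinese-roberta-wwm-ext"
# is a publicly available RoBERTa-WWM checkpoint used here only as an example.
model = JointModel("hfl/chinese-roberta-wwm-ext", n_intents=20, n_domains=5, n_slots=30)
focal = FocalLoss(gamma=2.0)
```

The Theseus compression step can be sketched in the same spirit: during training, each group of frozen predecessor (teacher) layers is stochastically swapped for one trainable successor layer, and after convergence only the smaller successor stack runs at inference. The group size and replacement rate below are again assumptions, and layers are simplified to callables that map hidden states to hidden states (attention masks omitted for brevity).

```python
import torch
import torch.nn as nn

class TheseusEncoder(nn.Module):
    # Module replacement in the style of BERT-of-Theseus: at train time each
    # group of predecessor layers is swapped for its successor layer with
    # probability p; at inference only the compressed successor stack runs.
    def __init__(self, predecessor_layers, successor_layers, replace_rate=0.5):
        super().__init__()
        assert len(predecessor_layers) % len(successor_layers) == 0
        self.pred = nn.ModuleList(predecessor_layers)  # frozen teacher layers
        self.succ = nn.ModuleList(successor_layers)    # trainable student layers
        self.group = len(predecessor_layers) // len(successor_layers)
        self.p = replace_rate

    def forward(self, hidden):
        if not self.training:
            for layer in self.succ:                    # compressed path only
                hidden = layer(hidden)
            return hidden
        for i, succ_layer in enumerate(self.succ):
            if torch.rand(1).item() < self.p:          # stochastic replacement
                hidden = succ_layer(hidden)
            else:
                for pred_layer in self.pred[i * self.group:(i + 1) * self.group]:
                    hidden = pred_layer(hidden)
        return hidden
```

In the paper's pipeline, per the abstract, the compressed encoder stands in for the full RoBERTa-WWM encoder inside the joint model, trading a slight accuracy loss for substantially faster prediction.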
Authors GAO Zuoyuan (高作缘) and TAO Hongcai (陶宏才), School of Computing & Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
Source Journal of Chengdu University of Information Technology, 2023, No. 3, pp. 251-257 (7 pages)
Funding National Natural Science Foundation of China (Grant No. 61806170)
Keywords RoBERTa-WWM model; multi-task joint learning; Theseus compression; Focal loss
