
Review of Text Generation Based on Deep Reinforcement Learning (Cited by: 2)
Abstract: Text generation tasks require the representation of large numbers of words or sentences and can be modeled as sequential decision problems. Given the excellent performance of deep reinforcement learning (DRL) in representation and decision-making, DRL plays an important role in text generation tasks. Text generation methods based on DRL change the training mechanism that targets maximum likelihood estimation, effectively solving the exposure bias problem of traditional methods. In addition, the combination of DRL and generative adversarial networks has further improved the quality of generated text and achieved remarkable results. This review systematically describes the application of DRL to text generation tasks, introduces classical models and algorithms, analyzes the characteristics of the models, and discusses the prospects and challenges of the future integration of DRL with text generation tasks.
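To make the abstract's central point concrete: under maximum likelihood training the model always conditions on ground-truth prefixes, while at inference it conditions on its own outputs (exposure bias); DRL instead treats each token as an action and optimizes a sequence-level reward. The following is a minimal toy sketch of this idea, not code from the reviewed paper: the vocabulary, reward function, and all hyperparameters are invented for illustration, and a context-free softmax policy is trained with a REINFORCE-style policy gradient.

```python
import numpy as np

# Toy sketch (assumptions, not the paper's method): text generation as a
# sequential decision process. A context-free softmax policy over a tiny
# invented vocabulary samples tokens one at a time; a reward arrives only
# after the full sequence, replacing the per-token MLE objective, so the
# model never conditions on ground-truth prefixes during training.

rng = np.random.default_rng(0)
VOCAB = ["good", "bad", "end"]   # hypothetical 3-word vocabulary
SEQ_LEN = 4

theta = np.zeros(len(VOCAB))     # logits of the toy policy

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample_sequence():
    """Sample a sequence token by token (the 'decisions')."""
    probs = softmax(theta)
    return [rng.choice(len(VOCAB), p=probs) for _ in range(SEQ_LEN)]

def reward(seq):
    """Sequence-level reward, available only once generation finishes."""
    return sum(1.0 for t in seq if VOCAB[t] == "good")

def reinforce_step(lr=0.1):
    """One REINFORCE update: scale the log-probability gradient of each
    sampled token by the baseline-subtracted sequence reward."""
    global theta
    seq = sample_sequence()
    advantage = reward(seq) - SEQ_LEN / len(VOCAB)  # constant baseline
    probs = softmax(theta)
    for t in seq:
        grad = -probs
        grad[t] += 1.0               # gradient of log pi(t | theta)
        theta = theta + lr * advantage * grad

for _ in range(500):
    reinforce_step()

print({w: round(p, 3) for w, p in zip(VOCAB, softmax(theta))})
```

After training, the policy concentrates probability on the rewarded token "good", illustrating how a sequence-level reward signal, rather than token-level likelihood, shapes generation. In the methods the review surveys, the reward is typically produced by a learned discriminator (the GAN setting) or a task metric rather than a hand-written rule.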
Authors: ZHAO Tingting, SONG Yajing, LI Guixi, WANG Yuan, CHEN Yarui, REN Dehua (College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China)
Source: Journal of Tianjin University of Science & Technology, CAS, 2022, No. 2, pp. 71-80 (10 pages)
Funding: National Natural Science Foundation of China (61976156); Tianjin Enterprise Science and Technology Commissioner Project (20YDTPJC00560)
Keywords: deep reinforcement learning; natural language generation; exposure bias; generative adversarial network
