A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-oriented Dialogue Policy Learning (Cited by: 2)


Abstract: Dialogue policy learning (DPL) is a key component of a task-oriented dialogue (TOD) system. Its goal is to decide the next action of the dialogue system at each turn, given the current dialogue state, based on a learned dialogue policy. Reinforcement learning (RL) is widely used to optimize this dialogue policy. In the learning process, the user is regarded as the environment and the system as the agent. In this paper, we present an overview of the recent advances and challenges in dialogue policy from the perspective of RL. More specifically, we identify the problems and summarize corresponding solutions for RL-based dialogue policy learning. In addition, we provide a comprehensive survey of applying RL to DPL by categorizing recent methods into five basic elements in RL. We believe this survey can shed light on future research in DPL.
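The framing in the abstract (the user as the environment, the system as the agent, and a policy that maps the dialogue state to the next system action) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-slot booking domain, the reward values, the rule-based `user_step` simulator, and the choice of tabular Q-learning as the RL method.

```python
import random

# Hypothetical toy domain: the system must collect two slots, then book.
SLOTS = ("date", "time")
ACTIONS = ("request_date", "request_time", "book")


def user_step(state, action):
    """Toy user simulator acting as the RL environment (assumed rewards).

    state is a tuple of booleans, one per slot, marking which are filled.
    Returns (next_state, reward, done).
    """
    if action == "book":
        # Booking succeeds only when every slot is filled.
        return state, (10 if all(state) else -5), True
    slot = SLOTS.index(action.split("_", 1)[1])
    next_state = tuple(f or i == slot for i, f in enumerate(state))
    return next_state, -1, False  # each extra turn costs 1


def greedy(q, state):
    """Pick the system action with the highest estimated value."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = (False, False)
        for _ in range(20):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state)
            next_state, reward, done = user_step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def rollout(q, max_turns=10):
    """Run one dialogue with the learned greedy policy."""
    state, total, actions = (False, False), 0, []
    for _ in range(max_turns):
        action = greedy(q, state)
        state, reward, done = user_step(state, action)
        total += reward
        actions.append(action)
        if done:
            break
    return total, actions


q_table = train()
total_reward, dialogue = rollout(q_table)
print(dialogue, total_reward)
```

In this toy setting the learned policy requests both missing slots before booking. Real task-oriented systems use far larger state and action spaces, which is where the methods covered by the survey come in.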

Authors: Wai-Chung Kwan, Hong-Ru Wang, Hui-Min Wang, Kam-Fai Wong

Affiliation: The Systems Engineering and Engineering Management Department

Source: Machine Intelligence Research (机器智能研究（英文版）), indexed in EI and CSCD, 2023, Issue 3, pp. 318-334 (17 pages)

Funding: Innovation and Technology Fund (ITF), Government of the Hong Kong Special Administrative Region (HKSAR), China (No. PRP-054-21FX).

Keywords: Dialogue policy learning (DPL); task-oriented dialogue system (TOD); reinforcement learning (RL); dialogue system; Markov decision process

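Classification codes: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]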


