Funding: Supported by the National Natural Science Foundation of China (Grant No. 61303108), the Natural Science Foundation of Jiangsu Province (BK20211102), the Suzhou Key Industries Technological Innovation-Prospective Applied Research Project (SYG201804), and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: 1 Introduction. Deep reinforcement learning has achieved great success, especially in games [1] and control. Unfortunately, real-world environments often involve more than one objective. For example, an autonomous car should consider constraints such as driving speed, energy efficiency, and the comfort and safety of the passengers [2,3]. To solve such problems, the Constrained Markov Decision Process (CMDP) was proposed to model tasks with constraints [4,5].
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61303108), the Natural Science Foundation of Jiangsu Province (BK20211102), the Suzhou Key Industries Technological Innovation-Prospective Applied Research Project (SYG201804), and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: 1 Introduction and main contributions. Deep reinforcement learning, which combines the advantages of deep learning and reinforcement learning, has achieved success in many fields [1]. However, during the learning process, the agent may still fail the task by taking improper actions that lead it into hazardous states.