Funding: National Natural Science Foundation of China (Nos. 61803260, 61673262 and 61175028).
Abstract: Multi-agent reinforcement learning has recently been applied to pursuit problems. However, each training episode contains a large number of time steps, so training often fails to converge, yielding low rewards and agents that cannot learn effective strategies. This paper proposes a deep reinforcement learning (DRL) training method that uses an ensemble segmented multi-reward function design to address this convergence problem. The ensemble reward function combines the advantages of two reward functions, which improves agent training in long episodes. We also eliminate the non-monotonic behavior in the reward function that is introduced by the trigonometric functions in the traditional 2D polar-coordinate observation representation. Experimental results show that the method outperforms the traditional single-reward-function mechanism in the pursuit scenario, raising the agents' policy scores on the task. These ideas offer a solution to the convergence challenges that DRL models face in long-episode pursuit problems and improve model training performance.
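The abstract does not give the reward formulas, so the following is only an illustrative sketch of what a segmented reward that combines two reward functions might look like, not the paper's actual design. All names (`progress_reward`, `proximity_reward`, `ensemble_reward`), the split by pursuer-target distance, and every constant are assumptions made for illustration. The sketch uses only the range to the target, a quantity that changes monotonically as the pursuer closes in, rather than sine or cosine terms of a bearing angle.

```python
def progress_reward(prev_dist: float, dist: float) -> float:
    """Dense term (assumed): positive when the pursuer reduced its distance."""
    return prev_dist - dist


def proximity_reward(dist: float, capture_radius: float) -> float:
    """Sparse term (assumed): bonus on capture, small per-step cost otherwise."""
    if dist <= capture_radius:
        return 10.0
    return -0.01


def ensemble_reward(
    prev_dist: float,
    dist: float,
    near_radius: float = 5.0,     # hypothetical segment boundary
    capture_radius: float = 0.5,  # hypothetical capture threshold
) -> float:
    """Piecewise combination of the two terms, split by distance segment."""
    if dist > near_radius:
        # Far segment: rely on the dense progress signal alone.
        return progress_reward(prev_dist, dist)
    # Near segment: capture-oriented term plus a down-weighted progress term.
    return proximity_reward(dist, capture_radius) + 0.5 * progress_reward(prev_dist, dist)


if __name__ == "__main__":
    print(ensemble_reward(10.0, 9.0))  # far segment
    print(ensemble_reward(4.0, 3.0))   # near segment, no capture
    print(ensemble_reward(1.0, 0.4))   # near segment, capture
```

In a sketch like this, the dense term gives a learning signal at every step of a long episode, while the sparse term keeps the final objective explicit near the target; how the paper actually segments and weights its two reward functions is not stated in the abstract.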
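Abstract: Food addiction is, in most cases, the core factor behind obesity and the failure to maintain a healthy body weight after weight loss. In neuroimaging, food addiction manifests as an imbalance in the neural circuits between the reward network and the cognitive control network. Real-time functional magnetic resonance imaging neurofeedback (rtfMRI-NF), a novel biofeedback technique, has already been applied in clinical research on and treatment of other substance addictions. In obesity with food addiction, rtfMRI-NF likewise has the potential to remodel abnormal brain function, improve eating behavior, and achieve weight loss. This review summarizes functional MRI brain imaging models of food addiction in patients with obesity, discusses feasible neural targets for rtfMRI-NF as a potential treatment tool, and reviews the latest research progress on rtfMRI-NF in obesity, providing a reference for future rtfMRI-NF treatment strategies and clinical guidance in obesity.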