Funding: Supported by the National Natural Science Foundation of China (No. 91948303).
Abstract: This paper addresses search-and-rescue tasks in which a mobile robot must find multiple targets of interest in an unknown dynamic environment. The problem is challenging because the robot must search for multiple targets while simultaneously avoiding obstacles. To ensure that the robot avoids obstacles properly, we propose a mixed-strategy Nash equilibrium based Dyna-Q (MNDQ) algorithm. First, a multi-objective layered structure is introduced to simplify the representation of multiple objectives and reduce computational complexity. This structure divides the overall task into subtasks, including searching for targets and avoiding obstacles. Second, a risk-monitoring mechanism is proposed based on the relative positions of dynamic risks. This mechanism helps the robot avoid potential collisions and unnecessary detours. Third, to improve sampling efficiency, MNDQ is presented, which combines Dyna-Q with a mixed-strategy Nash equilibrium. By using the mixed-strategy Nash equilibrium, the agent makes decisions in the form of probabilities, maximizing the expected reward and improving the overall performance of the Dyna-Q algorithm. Finally, a series of simulations is conducted to verify the effectiveness of the proposed method. The results show that MNDQ performs well and exhibits robustness, providing a competitive solution for future autonomous robot navigation tasks.
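The abstract does not give the MNDQ update rules, so the following is only a minimal sketch of the general idea it describes: tabular Dyna-Q in which the agent picks actions from a probability distribution rather than greedily. Everything specific here is an assumption made for illustration: the one-dimensional corridor environment, the function names (`step`, `mixed_policy`, `dyna_q`), the hyperparameters, and above all the softmax distribution, which is a simple stand-in for the mixed-strategy Nash equilibrium and not the paper's computation. The layered multi-objective structure and the risk-monitoring mechanism are not modeled.

```python
import math
import random

# Hypothetical toy task: a corridor of 6 cells with the target in the last cell.
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right
N_STATES = 6
GOAL = N_STATES - 1


def step(state, action_idx):
    """Deterministic environment: reward 1.0 when the target cell is reached."""
    nxt = min(max(state + ACTIONS[action_idx], 0), GOAL)
    return (1.0 if nxt == GOAL else 0.0), nxt


def mixed_policy(q_row, temperature=0.2):
    """Action probabilities from Q-values.

    Assumption: softmax is used here only as a placeholder for a mixed
    strategy; it is not a Nash equilibrium computation.
    """
    top = max(q_row)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_row]
    total = sum(weights)
    return [w / total for w in weights]


def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Dyna-Q with probabilistic action selection."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    model = {}  # (state, action) -> (reward, next_state), learned from experience

    def update(s, a, r, s2):
        # One-step Q-learning backup; Q at the target cell stays 0 (terminal).
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])

    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Sample an action from the probability distribution over actions.
            a = rng.choices(range(len(ACTIONS)), weights=mixed_policy(q[s]))[0]
            r, s2 = step(s, a)
            update(s, a, r, s2)           # learning from real experience
            model[(s, a)] = (r, s2)       # model learning
            for _ in range(planning_steps):
                # Planning: replay transitions simulated from the learned model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q
```

Calling `dyna_q()` returns the learned Q-table, and `mixed_policy(q[s])` gives the action probabilities in state `s`. The planning loop is what the sampling-efficiency claim rests on in Dyna-Q generally: each real transition is reused several times through the learned model, so fewer interactions with the environment are needed.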