Funding: Supported by the National Basic Research Program (973 Program, No. 2004CB719402), the National Natural Science Foundation of China (No. 60736019), and the Natural Science Foundation of Zhejiang Province, China (No. Y105430).
Abstract: Collision avoidance decision-making models for multiple agents in a virtual driving environment are studied. Based on the behavioral characteristics and hierarchical structure of collision avoidance decision-making in real-life driving, the Delphi approach and mathematical statistics are introduced to construct pairwise comparison judgment matrices of the collision avoidance decision choices for each collision situation. The analytic hierarchy process (AHP) is adopted to establish the agents' collision avoidance decision-making model. To simulate drivers' characteristics, driver factors are added to categorize driving into impatient, normal, and cautious modes. The results show that the model can simulate a human driver's thinking process: agents in the virtual environment can deal with collision situations and make decisions to avoid collisions without intervention. The model also reflects the diversity and uncertainty of real-life driving behaviors, and solves the multi-objective, multi-choice priority-ranking problem in multi-vehicle collision scenarios. The multi-agent collision avoidance model is feasible and effective; it provides richer, closer-to-life virtual scenes for driving simulators, reflects real-life traffic environments more faithfully, and improves the practicality of driving simulators.
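The AHP step described above can be sketched as follows. The 3×3 pairwise comparison matrix, the choice names (braking, steering, combined maneuver), and the Saaty-scale entries are illustrative assumptions, not values from the paper; the priority vector is approximated by the row geometric-mean method:

```python
import math

# Hypothetical pairwise comparison judgment matrix for one collision situation.
# Rows/columns: [braking, steering, combined maneuver]; entries use the 1-9
# Saaty scale (A[i][j] = how strongly choice i is preferred over choice j).
A = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

def ahp_weights(matrix):
    """Approximate the AHP priority vector by the row geometric-mean method."""
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

def consistency_ratio(matrix, weights):
    """CR = CI / RI; a CR below 0.1 is conventionally considered acceptable."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]  # Saaty's random index
    return ci / ri

w = ahp_weights(A)  # priority weights used to rank the avoidance choices
cr = consistency_ratio(A, w)
```

With these illustrative judgments, braking receives the largest weight and the matrix passes the usual consistency check, so the top-ranked choice would be selected for that collision situation.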
Funding: National Natural Science Foundation of China (No. 71403173).
Abstract: The decision-making process for public service facility configuration in multi-agent communities is usually simplistic and static. To reflect the dynamic changes and interactions of all behavioral subjects, including residents, real estate developers, and the government, a decision-making model of public service facility configuration based on multi-agent theory was built to improve the efficiency of public service facility configuration in the community and the living quality of residents. Taking a community to the east of Jinhui Port in Fengxian District, Shanghai as an example, the model analyzed the decision-makers' adaptive behaviors and simulated the decision-making criteria. The results indicate that the decision-making model and criteria can well satisfy the purpose of improving the validity and rationality of public service facility configuration in large communities.
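The interaction of the three behavioral subjects can be sketched as a toy negotiation loop. Everything here is a hypothetical illustration of the multi-agent idea, not the paper's model: the utility functions, the acceptance thresholds, and the single decision variable (number of facilities) are all assumptions:

```python
# Hypothetical acceptance thresholds for each agent type; a configuration is
# adopted only when every subject's criterion is satisfied.
THRESHOLDS = {"residents": 0.70, "developer": 0.60, "government": 0.65}

def evaluate(n_facilities):
    """Toy utilities for a candidate facility count in the range 0..10."""
    coverage = min(n_facilities / 10.0, 1.0)  # residents value service coverage
    cost = n_facilities / 10.0                # developer bears construction cost
    return {
        "residents": coverage,
        "developer": 1.0 - 0.5 * cost,        # profit shrinks as cost grows
        "government": 0.5 * coverage + 0.5 * (1.0 - cost) + 0.3,
    }

def negotiate(max_facilities=10):
    """Raise the facility count until all agents' criteria are met."""
    for n in range(max_facilities + 1):
        scores = evaluate(n)
        if all(scores[k] >= THRESHOLDS[k] for k in THRESHOLDS):
            return n, scores
    return None, {}  # no configuration satisfies every subject
```

Under these assumed numbers the loop settles on the smallest facility count that residents, the developer, and the government all accept, which is the dynamic-interaction behavior the static approach lacks.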
Funding: National Natural Science Foundation of China (Grant No. 61872171); the Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (Grant No. 2021490811).
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can provide outdated experiences that hinder convergence, resulting in ineffective training for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed, which generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning process by engaging in actions that increase the mutual information between their actions and the environment state. In addition, MioDSC can generate cooperative policies using the options framework, allowing agents to learn and reuse complex action sequences and accelerating the convergence of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and is robust across various multi-agent system tasks with high stability.
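The intrinsic-reward idea can be illustrated with a minimal sketch: estimate the empirical mutual information between (discretized) actions and states from a batch of samples, and add it as a bonus to the extrinsic reward. This is a generic MI-bonus construction under stated assumptions, not MioDSC's actual estimator; the bonus weight `beta` and the discrete sample format are assumptions:

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Empirical mutual information (in nats) between two discrete variables,
    estimated from a list of (action, state) samples."""
    n = len(pairs)
    pxy = Counter(pairs)                     # joint counts
    px = Counter(a for a, _ in pairs)        # marginal counts over actions
    py = Counter(s for _, s in pairs)        # marginal counts over states
    mi = 0.0
    for (a, s), c in pxy.items():
        # p(a,s) * log( p(a,s) / (p(a) * p(s)) ), with counts cancelled by n
        mi += (c / n) * math.log(c * n / (px[a] * py[s]))
    return mi

def shaped_reward(extrinsic, pairs, beta=0.1):
    """Extrinsic reward plus an MI-based intrinsic bonus with weight beta."""
    return extrinsic + beta * mutual_information(pairs)
```

When actions are independent of the state the bonus vanishes, while action sequences that are informative about the state earn a positive bonus, which is the exploration pressure the abstract describes.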