147,261 articles found
Multi-Agent Collaborative Task Planning with Uncertain Task Requirements
1
Authors: Jia Zhang, Zexuan Jin, Qichen Dong 《Journal of Beijing Institute of Technology》 EI CAS 2024, No. 5, pp. 361-373 (13 pages)
In response to the uncertainty of information about the injured in post-disaster situations, and considering constraints such as random chance and the quantity of rescue resources, the split delivery vehicle routing problem with stochastic demands (SDVRPSD) model and the multi-depot split delivery heterogeneous vehicle routing problem with stochastic demands (MDSDHVRPSD) model are established. A two-stage hybrid variable neighborhood tabu search algorithm is designed for unmanned vehicle task planning to minimize the path cost of rescue plans. Simulation experiments show that the solution obtained by the algorithm can effectively reduce the rescue vehicle path cost and the rescue task completion time, with high optimization quality and a degree of portability.
Keywords: multi-agent collaboration; task planning; vehicle routing problem; stochastic demands
Multi-Agent System for Real Time Planning Using Collaborative Agents (cited by: 1)
2
Authors: Ana Lilia Laureano-Cruces, Tzitziki Ramírez-González, Lourdes Sánchez-Guerrero, Javier Ramírez-Rodríguez 《International Journal of Intelligence Science》 2014, No. 4, pp. 91-103 (13 pages)
Autonomous agents are an important area of research because they are proactive and have goal-directed and communication capabilities. Furthermore, each agent's goals change constantly in a dynamic environment. Part of the challenge is to automate the process corresponding to each agent so that the agents find their own objectives. Agents do not have to work individually; they can work with others and develop a coordinated group of actions. Such agents are especially valuable when real-time problems are involved, meaning that an agent must be able to react within a specific time interval while considering external events. Our work focuses on the design of a multi-agent architecture consisting of autonomous agents capable of goal-directed action (a) under constraints, (b) in real time, and (c) with incomplete knowledge of the environment. This paper presents a model of a collaborative agent architecture in which the agents share a common knowledge source that provides knowledge of the environment; the agents analyze the environment and its changes and choose the most promising way of achieving their goals, so that the whole system keeps working even if a fault occurs.
Keywords: multi-agent systems; blackboard architecture; planning; schedule; collaborative systems; cognitive systems
Three-dimensional multi-constraint route planning of unmanned aerial vehicle low-altitude penetration based on coevolutionary multi-agent genetic algorithm (cited by: 8)
3
Authors: 彭志红, 吴金平, 陈杰 《Journal of Central South University》 SCIE EI CAS 2011, No. 5, pp. 1502-1508 (7 pages)
To address premature convergence and a slow convergence rate in three-dimensional (3D) route planning for unmanned aerial vehicle (UAV) low-altitude penetration, a novel route planning method was proposed. First, a coevolutionary multi-agent genetic algorithm (CE-MAGA) was formed by introducing a coevolutionary mechanism into the multi-agent genetic algorithm (MAGA), an efficient global optimization algorithm. A dynamic route representation was also adopted to improve flight route accuracy. Moreover, an efficient constraint handling method was used to simplify the treatment of multiple constraints and reduce the time cost of planning computation. Simulation and corresponding analysis show that the routes planned by CE-MAGA perform better on terrain following, terrain avoidance, and threat avoidance (TF/TA2) and have lower route costs than those of other existing algorithms. In addition, feasible flight routes can be acquired within 2 s, and the whole evolutionary process converges quickly.
Keywords: unmanned aerial vehicle (UAV); low-altitude penetration; three-dimensional (3D) route planning; coevolutionary multiagent genetic algorithm (CE-MAGA)
MULTI-AGENT BASED DISTRIBUTED PROCESS PLANNING MANAGEMENT
4
Authors: 赵世光, 严隽琪, 马登哲 《Journal of Shanghai Jiaotong university(Science)》 EI 1999, No. 2, pp. 91-96 (6 pages)
A distributed process planning system based on an autonomous multi-agent system was presented to solve distributed process planning tasks in a manufacturing environment. A distributed agent-based process plan structure was shown to be a viable alternative to hierarchical systems, providing real-time response to shop floor conditions. An outline was given of how to structure a distributed process plan and how its management may be achieved among the manufacturers of the parts that form a product. Communication between the agents involved in distributed process planning was also shown to be important, with the controlling agent having overall supervision of the plans. Based on the reference model, a software tool was developed to realize it.
Keywords: multi-agent; process plan; communication; cooperation
MULTI-AGENT COMPUTER AIDED ASSEMBLY PROCESS PLANNING SYSTEM FOR SHIP HULL
5
Authors: Zhang Shijie, Jing Shuang (Shenyang Institute of Automation, Chinese Academy of Sciences) 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD 2001, No. 1, pp. 57-61 (5 pages)
A multi-agent computer aided assembly process planning system (MCAAPP) for ship hulls is presented. The system includes the system framework, global facilitator, macro agent structure, agent communication language, agent-oriented programming language, knowledge representation, and reasoning strategy. The system can produce the technological file and technological quota, which satisfy the production needs of the factory.
Keywords: multi-agent; intelligent agent; computer aided process planning; knowledge representation
Communication Model for a Process Planning System Based on a Multi-agent
6
Authors: WANG Tao, DU Juan, WANG Chun-yan, LI Yun-xia 《International Journal of Plant Engineering and Management》 2012, No. 1, pp. 28-33 (6 pages)
This paper introduces a communication model for a process planning system based on a multi-agent system, and all levels of the communication process are described in detail. Communication in KQML (Knowledge Query and Manipulation Language) is emphasized: KQML communication performatives are used to achieve communication among the agents involved in process planning.
Keywords: multi-agent system; process planning; KQML; performative
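For readers unfamiliar with KQML, the following is a minimal sketch of how a performative message might be assembled as an s-expression string. It is not taken from the paper: the agent names, the ontology name, and the content expression are hypothetical, and only the standard KQML parameter keywords (`:sender`, `:receiver`, `:reply-with`, `:language`, `:ontology`, `:content`) are assumed.

```python
def kqml_message(performative, **params):
    """Format a KQML message as an s-expression string.

    Keyword names use underscores in Python and are emitted with hyphens,
    e.g. reply_with -> :reply-with.
    """
    fields = " ".join(
        f":{key.replace('_', '-')} {value}" for key, value in params.items()
    )
    return f"({performative} {fields})"


# Hypothetical exchange: a planner agent asks a machining agent a question.
msg = kqml_message(
    "ask-one",
    sender="planner_agent",          # hypothetical agent name
    receiver="machining_agent",      # hypothetical agent name
    reply_with="q1",
    language="KIF",
    ontology="process-planning",     # hypothetical ontology name
    content='"(available-machine ?m)"',
)
print(msg)
```

The receiving agent would typically answer with a `tell` performative whose `:in-reply-to` value matches `q1`.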
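Consensus and Trajectory Planning with Input Constraints for Multi-agent Systems (cited by: 9)
7
Authors: YAN Jing, GUAN Xin-Ping, LUO Xiao-Yuan, YANG Xian 《自动化学报》 (Acta Automatica Sinica) EI CSCD PKU Core 2012, No. 7, pp. 1074-1082 (9 pages)
Keywords: multi-agent systems; trajectory planning; input; optimal control problem; detection range; control algorithm; cost function; control state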
Coordinated dynamic mission planning scheme for intelligent multi-agent systems
8
Authors: 彭军, 文孟飞, 谢国祺, 张晓勇, Kuo-chi LIN 《Journal of Central South University》 SCIE EI CAS 2012, No. 11, pp. 3170-3179 (10 pages)
Mission planning has been thoroughly studied for multiple intelligent agent systems, such as multiple unmanned air vehicles and multiple processor systems. However, it still faces challenges due to system complexity, execution order constraints, and the uncertainty of dynamic environments. To address these challenges, a coordinated dynamic mission planning scheme is proposed that uses the weighted AND/OR tree and the AOE-Network. In the scheme, the mission is decomposed into a time-constrained weighted AND/OR tree, which is converted into an AOE-Network for mission planning. Then, a dynamic planning algorithm is designed that uses task subcontracting and dynamic re-decomposition to coordinate conflicts. The scheme can reduce task complexity and execution time by implementing real-time dynamic re-planning. Simulation demonstrates the effectiveness of this approach.
Keywords: weighted AND/OR tree; multiple intelligent agent; coordinated dynamic mission planning; AOE-Network
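As background on the AOE-Network (activity-on-edge network) mentioned in this abstract, the sketch below shows the textbook critical-path computation on such a network. It is not the paper's planning algorithm: the AND/OR tree conversion, task subcontracting, and re-decomposition are not modeled, and the function name and example network are assumptions made for illustration.

```python
from collections import defaultdict, deque


def critical_path(num_events, activities):
    """Return (makespan, critical activities) of an AOE network.

    activities: list of (from_event, to_event, duration) edges of a DAG
    whose events are numbered 0 .. num_events - 1.
    """
    succ = defaultdict(list)
    indegree = [0] * num_events
    for u, v, d in activities:
        succ[u].append((v, d))
        indegree[v] += 1

    # Topological order (Kahn's algorithm).
    order = []
    queue = deque(e for e in range(num_events) if indegree[e] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != num_events:
        raise ValueError("network contains a cycle")

    # Forward pass: earliest time at which each event can occur.
    earliest = [0] * num_events
    for u in order:
        for v, d in succ[u]:
            earliest[v] = max(earliest[v], earliest[u] + d)
    makespan = max(earliest)

    # Backward pass: latest time each event may occur without delay.
    latest = [makespan] * num_events
    for u in reversed(order):
        for v, d in succ[u]:
            latest[u] = min(latest[u], latest[v] - d)

    # An activity is critical when it has zero slack.
    critical = [(u, v, d) for u, v, d in activities
                if earliest[u] == latest[v] - d]
    return makespan, critical


# Hypothetical mission with five events and five activities.
demo = [(0, 1, 3), (0, 2, 2), (1, 3, 4), (2, 3, 1), (3, 4, 2)]
makespan, critical = critical_path(5, demo)
print(makespan, critical)
```

Activities off the critical path have positive slack, which is the margin a re-planning step could use when reassigning tasks.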
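Design of a Multi-Agent-Based Autonomous Combat System for UAV Swarm System-of-Systems (cited by: 1)
9
Authors: 张堃, 华帅, 袁斌林, 杜睿怡 《系统工程与电子技术》 (Systems Engineering and Electronics) EI CSCD PKU Core 2024, No. 4, pp. 1273-1286 (14 pages)
To address key problems in the design of autonomous combat system-of-systems for unmanned swarms, a Multi-Agent-based design method for an unmanned swarm autonomous combat system is proposed. Agent models of the nodes of the unmanned swarm and their inference rules are established. To meet the requirements for modularity and generality of the simulation system, interoperable system interfaces and the interaction relationships of unmanned swarm autonomous combat are designed. Simulation-based verification of the unmanned swarm system is then carried out. The simulation results show that the proposed design can not only effectively carry out and complete a dynamic demonstration and verification of the whole process of autonomous combat network generation, swarm evolution, and effectiveness evaluation, but can also further evaluate the cooperative combat effectiveness of the unmanned swarm through repeated randomized trials. Finally, strategies and lessons for swarm cooperative combat are summarized.
Keywords: multi-agent; unmanned swarm; system-of-systems design; cooperative combat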
Reactive and Robust Planning of Moroccan Citrus Chain Based on Multi-agent System and Performance Indicators
10
Authors: Hind Moutaoakil, Hicham Jamouli 《Journal of Traffic and Transportation Engineering》 2015, No. 1, pp. 19-34 (16 pages)
As part of improving services for clients in all Moroccan regions, the Moroccan fruit and vegetable export group, in collaboration with its packaging units and producers, seeks to cooperate in order to face international competition. Indeed, the complexity of partner networks has led policy-makers to implement new techniques and tools to help control different processes. For this reason, permanent monitoring of operations spanning the production, packaging, and distribution of perishable products has become paramount. This article proposes a multi-agent model of the citrus supply chain, based on indicators for monitoring and evaluating the performance of its logistics systems, in order to build a new independent, robust, and responsive chain and to optimize and control the flow of materials and information between the different actors and stakeholders of the chain.
Keywords: citrus supply chain; supply chain operation reference; multi-agent system; export; performance indicator
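Multi-Agent-Based Fault Diagnosis System for Hydropower Station Transformers
11
Authors: 乔丹, 马鹏, 王琦 《自动化技术与应用》 (Techniques of Automation and Applications) 2024, No. 7, pp. 58-61, 65 (5 pages)
To diagnose hydropower station transformer faults accurately and quickly, a Multi-Agent-based transformer fault diagnosis system is designed. The transformer status monitoring agent sends detected transformer fault information to the system management agent, which forwards it through the communication agent to the transformer fault diagnosis agent. The fault diagnosis agent extracts transformer fault features using the wavelet transform and feeds them to an IFOA-SVM model; once the faults are classified, the diagnosis result is obtained and displayed to the user through the communication agent. Experiments show that the system diagnoses transformer faults effectively, that its diagnosis success rate is only slightly affected by the loss rate of fault information in the system, and that it has low diagnosis time and energy consumption together with a high diagnosis success rate.
Keywords: multi-agent; hydropower station; transformer; fault diagnosis; wavelet transform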
UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach (cited by: 1)
12
Authors: Jiawen Kang, Junlong Chen, Minrui Xu, Zehui Xiong, Yutao Jiao, Luchao Han, Dusit Niyato, Yongju Tong, Shengli Xie 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024, No. 2, pp. 430-445 (16 pages)
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
Keywords: avatar; blockchain; metaverses; multi-agent deep reinforcement learning; transformer; UAVs
Evolutionary Decision-Making and Planning for Autonomous Driving Based on Safe and Rational Exploration and Exploitation (cited by: 2)
13
Authors: Kang Yuan, Yanjun Huang, Shuo Yang, Zewei Zhou, Yulei Wang, Dongpu Cao, Hong Chen 《Engineering》 SCIE EI CAS CSCD 2024, No. 2, pp. 108-120 (13 pages)
Decision-making and motion planning are extremely important in autonomous driving to ensure safe driving in a real-world environment. This study proposes an online evolutionary decision-making and motion planning framework for autonomous driving based on a hybrid data- and model-driven method. First, a data-driven decision-making module based on deep reinforcement learning (DRL) is developed to pursue rational driving performance as far as possible. Then, model predictive control (MPC) is employed to execute both longitudinal and lateral motion planning tasks. Multiple constraints are defined according to the vehicle's physical limits to meet the driving task requirements. Finally, two principles of safety and rationality for the self-evolution of autonomous driving are proposed. A motion envelope is established and embedded into a rational exploration and exploitation scheme, which filters out unreasonable experiences by masking unsafe actions so as to collect high-quality training data for the DRL agent. Experiments with a high-fidelity vehicle model and a MATLAB/Simulink co-simulation environment are conducted, and the results show that the proposed online-evolution framework is able to generate safer, more rational, and more efficient driving actions in a real-world environment.
Keywords: autonomous driving; decision-making; motion planning; deep reinforcement learning; model predictive control
Finite-time Prescribed Performance Time-Varying Formation Control for Second-Order Multi-Agent Systems With Non-Strict Feedback Based on a Neural Network Observer (cited by: 1)
14
Authors: Chi Ma, Dianbiao Dong 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024, No. 4, pp. 1039-1050 (12 pages)
This paper studies the problem of time-varying formation control with finite-time prescribed performance for non-strict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To eliminate nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that the formation errors of the multi-agent systems converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Keywords: finite-time control; multi-agent systems; neural network; prescribed performance control; time-varying formation control
Motion Planning for Autonomous Driving with Real Traffic Data Validation (cited by: 1)
15
Authors: Wenbo Chu, Kai Yang, Shen Li, Xiaolin Tang 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD 2024, No. 1, pp. 74-86 (13 pages)
Accurate trajectory prediction of surrounding road users is the fundamental input for motion planning, which enables safe autonomous driving on public roads. In this paper, a safe motion planning approach is proposed based on a deep learning-based trajectory prediction method. To begin with, a trajectory prediction model is established based on a graph neural network (GNN) that is trained using the INTERACTION dataset. Then, the validated trajectory prediction model is used to predict the future trajectories of surrounding road users, including pedestrians and vehicles. In addition, a GNN prediction model-enabled motion planner is developed based on the model predictive control technique. Furthermore, two driving scenarios, merging and roundabout, are extracted from the INTERACTION dataset to validate and evaluate the effectiveness of the proposed motion planning approach. The results demonstrate that the proposed method can lower risk and improve driving safety compared with the baseline method.
Keywords: trajectory prediction; graph neural network; motion planning; INTERACTION dataset
Ground threat prediction-based path planning of unmanned autonomous helicopter using hybrid enhanced artificial bee colony algorithm (cited by: 1)
16
Authors: Zengliang Han, Mou Chen, Haojie Zhu, Qingxian Wu 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2024, No. 2, pp. 1-22 (22 pages)
The unmanned autonomous helicopter (UAH) path planning problem is an important component of the UAH mission planning system. To reduce the influence of incomplete ground threat information on UAH path planning, a ground threat prediction-based path planning method is proposed based on the artificial bee colony (ABC) algorithm with a collaborative thinking strategy. Firstly, a dynamic threat distribution probability model is developed based on the characteristics of typical ground threats. The dynamic no-fly zone of the UAH is simulated and established by calculating the distribution probability of ground threats in real time. Then, a dynamic path planning method for the UAH in complex environments is designed based on the real-time prediction of ground threats. By adding a collision warning mechanism to the path planning model, the flight path can be dynamically adjusted according to changing no-fly zones. Furthermore, a hybrid enhanced ABC algorithm is proposed based on the collaborative thinking strategy. The proposed algorithm applies a leader-member thinking mechanism to guide the direction of population evolution and reduces the negative impact of local optimal solutions caused by the collaborative learning update strategy, which makes the optimization performance of the ABC algorithm more controllable and efficient. Finally, simulation results verify the feasibility and effectiveness of the proposed ground threat prediction path planning method.
Keywords: UAH; path planning; ground threat prediction; hybrid enhanced; collaborative thinking
Robust adaptive leaderless consensus of unknown non-minimum phase linear multi-agent systems subject to disturbances and/or unmodeled dynamics (cited by: 1)
17
Authors: Wenji Cao, Gang Feng 《Journal of Automation and Intelligence》 2024, No. 2, pp. 92-100 (9 pages)
This article investigates the problem of robust adaptive leaderless consensus for heterogeneous uncertain non-minimum phase linear multi-agent systems over directed communication graphs. Each agent is assumed to be of unknown nominal dynamics and also subject to external disturbances and/or unmodeled dynamics. A novel distributed robust adaptive control strategy is proposed. It is shown that the robust adaptive leaderless consensus problem is solved with the proposed control strategy under some sufficient conditions. Two examples are provided to demonstrate the efficacy of the proposed control strategy.
Keywords: cooperative control; non-minimum phase; leaderless consensus; multi-agent systems; robust adaptive control
Life cycle assessment as a prospective tool for sustainable agriculture and food planning at a local level (cited by: 1)
18
Authors: Andrea Lulovicova, Stephane Bouissou 《Geography and Sustainability》 CSCD 2024, No. 2, pp. 251-264 (14 pages)
Owing to the far-reaching environmental consequences of agriculture and food systems, such as their contribution to climate change, there is an urgent need to reduce their impact. International and national governments set sustainability targets and implement corresponding measures. Nevertheless, critics of the globalized system claim that a territorial administrative scale is better suited to address sustainability issues. Yet, at the subnational level, local authorities rarely apply a systemic environmental assessment to enhance their action plans. This paper employs a territorial life cycle assessment methodology to improve local environmental agri-food planning. The objective is to identify significant direct and indirect environmental hotspots and their origins, and to formulate effective mitigation strategies. The methodology is applied to the administrative department of Finistere, a strategic agricultural region in North-Western France. Multiple environmental criteria, including climate change, fossil resource scarcity, toxicity, and land use, are modeled. The findings reveal that the primary environmental hotspots of the studied local food system arise from indirect sources, such as livestock feed or diesel consumption. Livestock reduction and organic farming conversion emerge as the most environmentally efficient strategies, resulting in a 25% decrease in the climate change indicator. However, the overall modeled impact reduction is insufficient with respect to national objectives and remains limited for the land use indicator. These results highlight the innovative application of life cycle assessment at a local level, offering insights for the further advancement of systematic and prospective local agri-food assessment. Additionally, they provide guidance for local authorities to enhance the sustainability of planning strategies.
Keywords: environmental analysis; territorial life cycle assessment; prospective scenario; agri-food planning; local food system
Distributed collaborative complete coverage path planning based on hybrid strategy
19
Authors: ZHANG Jia, DU Xin, DONG Qichen, XIN Bin 《Journal of Systems Engineering and Electronics》 SCIE CSCD 2024, No. 2, pp. 463-472 (10 pages)
Collaborative coverage path planning (CCPP) refers to obtaining the shortest paths passing over all places except obstacles in a certain area or space. A multi-unmanned aerial vehicle (UAV) CCPP algorithm is proposed for urban rescue search or military search in outdoor environments. Because small UAVs can be controlled flexibly, all UAVs can be assumed to fly at the same altitude; that is, they perform search tasks on a two-dimensional plane. Based on the agents' motion characteristics and environmental information, a mathematical model of the CCPP problem is established. The objective function is the minimum time for the UAVs to complete the CCPP, and complete coverage, no-fly, collision avoidance, and communication constraints are considered. Four motion strategies and two communication strategies are designed. Then a distributed CCPP algorithm is designed based on hybrid strategies. Simulation results compared with the pattern-based genetic algorithm (PBGA) and the random search method show that the proposed method has stronger real-time performance and better scalability and can complete the CCPP task more efficiently and stably.
Keywords: multi-agent cooperation; unmanned aerial vehicles (UAV); distributed algorithm; complete coverage path planning (CCPP)
Discovering Latent Variables for the Tasks With Confounders in Multi-Agent Reinforcement Learning
20
Authors: Kun Jiang, Wenzhang Liu, Yuanda Wang, Lu Dong, Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024, No. 7, pp. 1591-1604 (14 pages)
Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning (MARL). It is significantly more difficult for tasks with latent variables that agents cannot directly observe. However, most existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable (MASAC-LV) algorithm, which uses variational inference theory to infer a compact latent variable representation space from a large amount of offline experience. In addition, we derive the counterfactual policy, whose input has no latent variables, and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is treated as an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms.
Keywords: latent variable model; maximum entropy; multi-agent reinforcement learning (MARL); multi-agent system