期刊文献+
共找到111篇文章
< 1 2 6 >
每页显示 20 50 100
UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach 被引量:1
1
作者 Jiawen Kang Junlong Chen +6 位作者 Minrui Xu Zehui Xiong Yutao Jiao Luchao Han Dusit Niyato Yongju Tong Shengli Xie 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第2期430-445,共16页
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metavers... Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation,which consumes intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units(RSU)or unmanned aerial vehicles(UAV) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning(MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization(MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers(e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses. 展开更多
关键词 AVATAR blockchain metaverses multi-agent deep reinforcement learning transformer UAVS
下载PDF
Discovering Latent Variables for the Tasks With Confounders in Multi-Agent Reinforcement Learning
2
作者 Kun Jiang Wenzhang Liu +2 位作者 Yuanda Wang Lu Dong Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第7期1591-1604,共14页
Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning(MARL). It is significantly more difficult for those tasks with latent variables that ... Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning(MARL). It is significantly more difficult for those tasks with latent variables that agents cannot directly observe. However, most of the existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable(MASAC-LV) algorithm, which uses variational inference theory to infer the compact latent variables representation space from a large amount of offline experience.Besides, we derive the counterfactual policy whose input has no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is considered an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms. 展开更多
关键词 Latent variable model maximum entropy multi-agent reinforcement learning(MARL) multi-agent system
下载PDF
Unleashing the Power of Multi-Agent Reinforcement Learning for Algorithmic Trading in the Digital Financial Frontier and Enterprise Information Systems
3
作者 Saket Sarin Sunil K.Singh +4 位作者 Sudhakar Kumar Shivam Goyal Brij Bhooshan Gupta Wadee Alhalabi Varsha Arya 《Computers, Materials & Continua》 SCIE EI 2024年第8期3123-3138,共16页
In the rapidly evolving landscape of today’s digital economy,Financial Technology(Fintech)emerges as a trans-formative force,propelled by the dynamic synergy between Artificial Intelligence(AI)and Algorithmic Trading... In the rapidly evolving landscape of today’s digital economy,Financial Technology(Fintech)emerges as a trans-formative force,propelled by the dynamic synergy between Artificial Intelligence(AI)and Algorithmic Trading.Our in-depth investigation delves into the intricacies of merging Multi-Agent Reinforcement Learning(MARL)and Explainable AI(XAI)within Fintech,aiming to refine Algorithmic Trading strategies.Through meticulous examination,we uncover the nuanced interactions of AI-driven agents as they collaborate and compete within the financial realm,employing sophisticated deep learning techniques to enhance the clarity and adaptability of trading decisions.These AI-infused Fintech platforms harness collective intelligence to unearth trends,mitigate risks,and provide tailored financial guidance,fostering benefits for individuals and enterprises navigating the digital landscape.Our research holds the potential to revolutionize finance,opening doors to fresh avenues for investment and asset management in the digital age.Additionally,our statistical evaluation yields encouraging results,with metrics such as Accuracy=0.85,Precision=0.88,and F1 Score=0.86,reaffirming the efficacy of our approach within Fintech and emphasizing its reliability and innovative prowess. 展开更多
关键词 Neurodynamic Fintech multi-agent reinforcement learning algorithmic trading digital financial frontier
下载PDF
Regional Multi-Agent Cooperative Reinforcement Learning for City-Level Traffic Grid Signal Control
4
作者 Yisha Li Ya Zhang +1 位作者 Xinde Li Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第9期1987-1998,共12页
This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system.A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight... This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system.A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve the traffic efficiency.Firstly a regional multi-agent Q-learning framework is proposed,which can equivalently decompose the global Q value of the traffic system into the local values of several regions Based on the framework and the idea of human-machine cooperation,a dynamic zoning method is designed to divide the traffic network into several strong-coupled regions according to realtime traffic flow densities.In order to achieve better cooperation inside each region,a lightweight spatio-temporal fusion feature extraction network is designed.The experiments in synthetic real-world and city-level scenarios show that the proposed RegionS TLight converges more quickly,is more stable,and obtains better asymptotic performance compared to state-of-theart models. 展开更多
关键词 Human-machine cooperation mixed domain attention mechanism multi-agent reinforcement learning spatio-temporal feature traffic signal control
下载PDF
Collision-free parking recommendation based on multi-agent reinforcement learning in vehicular crowdsensing
5
作者 Xin Li Xinghua Lei +1 位作者 Xiuwen Liu Hang Xiao 《Digital Communications and Networks》 SCIE CSCD 2024年第3期609-619,共11页
The recent proliferation of Fifth-Generation(5G)networks and Sixth-Generation(6G)networks has given rise to Vehicular Crowd Sensing(VCS)systems which solve parking collisions by effectively incentivizing vehicle parti... The recent proliferation of Fifth-Generation(5G)networks and Sixth-Generation(6G)networks has given rise to Vehicular Crowd Sensing(VCS)systems which solve parking collisions by effectively incentivizing vehicle participation.However,instead of being an isolated module,the incentive mechanism usually interacts with other modules.Based on this,we capture this synergy and propose a Collision-free Parking Recommendation(CPR),a novel VCS system framework that integrates an incentive mechanism,a non-cooperative VCS game,and a multi-agent reinforcement learning algorithm,to derive an optimal parking strategy in real time.Specifically,we utilize an LSTM method to predict parking areas roughly for recommendations accurately.Its incentive mechanism is designed to motivate vehicle participation by considering dynamically priced parking tasks and social network effects.In order to cope with stochastic parking collisions,its non-cooperative VCS game further analyzes the uncertain interactions between vehicles in parking decision-making.Then its multi-agent reinforcement learning algorithm models the VCS campaign as a multi-agent Markov decision process that not only derives the optimal collision-free parking strategy for each vehicle independently,but also proves that the optimal parking strategy for each vehicle is Pareto-optimal.Finally,numerical results demonstrate that CPR can accomplish parking tasks at a 99.7%accuracy compared with other baselines,efficiently recommending parking spaces. 展开更多
关键词 Incentive mechanism Non-cooperative VCS game multi-agent reinforcement learning Collision-free parking strategy Vehicular crowdsensing
下载PDF
Safety-Constrained Multi-Agent Reinforcement Learning for Power Quality Control in Distributed Renewable Energy Networks
6
作者 Yongjiang Zhao Haoyi Zhong Chang Cyoon Lim 《Computers, Materials & Continua》 SCIE EI 2024年第4期449-471,共23页
This paper examines the difficulties of managing distributed power systems,notably due to the increasing use of renewable energy sources,and focuses on voltage control challenges exacerbated by their variable nature i... This paper examines the difficulties of managing distributed power systems,notably due to the increasing use of renewable energy sources,and focuses on voltage control challenges exacerbated by their variable nature in modern power grids.To tackle the unique challenges of voltage control in distributed renewable energy networks,researchers are increasingly turning towards multi-agent reinforcement learning(MARL).However,MARL raises safety concerns due to the unpredictability in agent actions during their exploration phase.This unpredictability can lead to unsafe control measures.To mitigate these safety concerns in MARL-based voltage control,our study introduces a novel approach:Safety-ConstrainedMulti-Agent Reinforcement Learning(SC-MARL).This approach incorporates a specialized safety constraint module specifically designed for voltage control within the MARL framework.This module ensures that the MARL agents carry out voltage control actions safely.The experiments demonstrate that,in the 33-buses,141-buses,and 322-buses power systems,employing SC-MARL for voltage control resulted in a reduction of the Voltage Out of Control Rate(%V.out)from0.43,0.24,and 2.95 to 0,0.01,and 0.03,respectively.Additionally,the Reactive Power Loss(Q loss)decreased from 0.095,0.547,and 0.017 to 0.062,0.452,and 0.016 in the corresponding systems. 展开更多
关键词 Power quality control multi-agent reinforcement learning safety-constrained MARL
下载PDF
Service Function Chain Deployment Algorithm Based on Multi-Agent Deep Reinforcement Learning
7
作者 Wanwei Huang Qiancheng Zhang +2 位作者 Tao Liu YaoliXu Dalei Zhang 《Computers, Materials & Continua》 SCIE EI 2024年第9期4875-4893,共19页
Aiming at the rapid growth of network services,which leads to the problems of long service request processing time and high deployment cost in the deployment of network function virtualization service function chain(S... Aiming at the rapid growth of network services,which leads to the problems of long service request processing time and high deployment cost in the deployment of network function virtualization service function chain(SFC)under 5G networks,this paper proposes a multi-agent deep deterministic policy gradient optimization algorithm for SFC deployment(MADDPG-SD).Initially,an optimization model is devised to enhance the request acceptance rate,minimizing the latency and deploying the cost SFC is constructed for the network resource-constrained case.Subsequently,we model the dynamic problem as a Markov decision process(MDP),facilitating adaptation to the evolving states of network resources.Finally,by allocating SFCs to different agents and adopting a collaborative deployment strategy,each agent aims to maximize the request acceptance rate or minimize latency and costs.These agents learn strategies from historical data of virtual network functions in SFCs to guide server node selection,and achieve approximately optimal SFC deployment strategies through a cooperative framework of centralized training and distributed execution.Experimental simulation results indicate that the proposed method,while simultaneously meeting performance requirements and resource capacity constraints,has effectively increased the acceptance rate of requests compared to the comparative algorithms,reducing the end-to-end latency by 4.942%and the deployment cost by 8.045%. 展开更多
关键词 Network function virtualization service function chain Markov decision process multi-agent reinforcement learning
下载PDF
A survey on multi-agent reinforcement learning and its application
8
作者 Zepeng Ning Lihua Xie 《Journal of Automation and Intelligence》 2024年第2期73-91,共19页
Multi-agent reinforcement learning(MARL)has been a rapidly evolving field.This paper presents a comprehensive survey of MARL and its applications.We trace the historical evolution of MARL,highlight its progress,and di... Multi-agent reinforcement learning(MARL)has been a rapidly evolving field.This paper presents a comprehensive survey of MARL and its applications.We trace the historical evolution of MARL,highlight its progress,and discuss related survey works.Then,we review the existing works addressing inherent challenges and those focusing on diverse applications.Some representative stochastic games,MARL means,spatial forms of MARL,and task classification are revisited.We then conduct an in-depth exploration of a variety of challenges encountered in MARL applications.We also address critical operational aspects,such as hyperparameter tuning and computational complexity,which are pivotal in practical implementations of MARL.Afterward,we make a thorough overview of the applications of MARL to intelligent machines and devices,chemical engineering,biotechnology,healthcare,and societal issues,which highlights the extensive potential and relevance of MARL within both current and future technological contexts.Our survey also encompasses a detailed examination of benchmark environments used in MARL research,which are instrumental in evaluating MARL algorithms and demonstrate the adaptability of MARL to diverse application scenarios.In the end,we give our prospect for MARL and discuss their related techniques and potential future applications. 展开更多
关键词 Benchmark environments multi-agent reinforcement learning multi-agent systems Stochastic games
下载PDF
Performance Evaluation ofMulti-Agent Reinforcement Learning Algorithms
9
作者 Abdulghani M.Abdulghani Mokhles M.Abdulghani +1 位作者 Wilbur L.Walters Khalid H.Abed 《Intelligent Automation & Soft Computing》 2024年第2期337-352,共16页
Multi-Agent Reinforcement Learning(MARL)has proven to be successful in cooperative assignments.MARL is used to investigate how autonomous agents with the same interests can connect and act in one team.MARL cooperation... Multi-Agent Reinforcement Learning(MARL)has proven to be successful in cooperative assignments.MARL is used to investigate how autonomous agents with the same interests can connect and act in one team.MARL cooperation scenarios are explored in recreational cooperative augmented reality environments,as well as realworld scenarios in robotics.In this paper,we explore the realm of MARL and its potential applications in cooperative assignments.Our focus is on developing a multi-agent system that can collaborate to attack or defend against enemies and achieve victory withminimal damage.To accomplish this,we utilize the StarCraftMulti-Agent Challenge(SMAC)environment and train four MARL algorithms:Q-learning with Mixtures of Experts(QMIX),Value-DecompositionNetwork(VDN),Multi-agent Proximal PolicyOptimizer(MAPPO),andMulti-Agent Actor Attention Critic(MAA2C).These algorithms allow multiple agents to cooperate in a specific scenario to achieve the targeted mission.Our results show that the QMIX algorithm outperforms the other three algorithms in the attacking scenario,while the VDN algorithm achieves the best results in the defending scenario.Specifically,the VDNalgorithmreaches the highest value of battle wonmean and the lowest value of dead alliesmean.Our research demonstrates the potential forMARL algorithms to be used in real-world applications,such as controllingmultiple robots to provide helpful services or coordinating teams of agents to accomplish tasks that would be impossible for a human to do.The SMAC environment provides a unique opportunity to test and evaluate MARL algorithms in a challenging and dynamic environment,and our results show that these algorithms can be used to achieve victory with minimal damage. 展开更多
关键词 reinforcement learning RL multi-agent MARL SMAC VDN QMIX MAPPO
下载PDF
An Optimal Control-Based Distributed Reinforcement Learning Framework for A Class of Non-Convex Objective Functionals of the Multi-Agent Network 被引量:2
10
作者 Zhe Chen Ning Li 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2023年第11期2081-2093,共13页
This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of the multi-agent network under privacy protection, which means that the local objecti... This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of the multi-agent network under privacy protection, which means that the local objective of each agent is unknown to others. The above problem involves complexity simultaneously in the time and space aspects. Yet existing works about distributed optimization mainly consider privacy protection in the space aspect where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered in this paper, the decision variable is a continuous function concerning time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation.Hence, we seek the optimal decision derivative function rather than the decision function. This manner can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning(RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework. 展开更多
关键词 Distributed optimization multi-agent optimal control reinforcement learning(RL)
下载PDF
Coach-assistedmulti-agent reinforcement learning framework for unexpected crashed agents 被引量:2
11
作者 Jian ZHAO Youpeng ZHAO +5 位作者 Weixun WANG Mingyu YANG Xunhan HU Wengang ZHOU Jianye HAO Houqiang LI 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2022年第7期1032-1042,共11页
Multi-agent reinforcement learning is difficult to apply in practice,partially because of the gap between simulated and real-world scenarios.One reason for the gap is that simulated systems always assume that agents c... Multi-agent reinforcement learning is difficult to apply in practice,partially because of the gap between simulated and real-world scenarios.One reason for the gap is that simulated systems always assume that agents can work normally all the time,while in practice,one or more agents may unexpectedly“crash”during the coordination process due to inevitable hardware or software failures.Such crashes destroy the cooperation among agents and lead to performance degradation.In this work,we present a formal conceptualization of a cooperative multi-agent reinforcement learning system with unexpected crashes.To enhance the robustness of the system to crashes,we propose a coach-assisted multi-agent reinforcement learning framework that introduces a virtual coach agent to adjust the crash rate during training.We have designed three coaching strategies(fixed crash rate,curriculum learning,and adaptive crash rate)and a re-sampling strategy for our coach agent.To our knowledge,this work is the first to study unexpected crashes in a multi-agent system.Extensive experiments on grid-world and StarCraft II micromanagement tasks demonstrate the efficacy of the adaptive strategy compared with the fixed crash rate strategy and curriculum learning strategy.The ablation study further illustrates the effectiveness of our re-sampling strategy. 展开更多
关键词 multi-agent system reinforcement learning Unexpected crashed agents
原文传递
Optimal Design of Drying Process of the Potatoes withMulti-Agent Reinforced Deep Learning
12
作者 Mohammad Yaghoub Abdollahzadeh Jamalabadi 《Frontiers in Heat and Mass Transfer》 EI 2024年第2期511-536,共26页
Heat and mass transport through evaporation or drying processes occur in many applications such as food processing,pharmaceutical products,solar-driven vapor generation,textile design,and electronic cigarettes.In this... Heat and mass transport through evaporation or drying processes occur in many applications such as food processing,pharmaceutical products,solar-driven vapor generation,textile design,and electronic cigarettes.In this paper,the transport of water from a fresh potato considered as a wet porous media with laminar convective dry air fluid flow governed by Darcy’s law in two-dimensional is highlighted.Governing equations of mass conservation,momentumconservation,multiphase fluid flowin porousmedia,heat transfer,and transport of participating fluids and gases through evaporation from liquid to gaseous phase are solved simultaneously.In this model,the variable is block locations,the object function is changing water saturation inside the porous medium and the constraint is the constant mass of porous material.It shows that there is an optimal configuration for the purpose of water removal from the specimen.The results are compared with experimental and analyticalmethods Benchmark.Then for the purpose of configuration optimization,multi-agent reinforcement learning(MARL)is used while multiple porous blocks are considered as agents that transfer their moisture content with the environment in a real-world scenario.MARL has been tested and validated with previous conventional effective optimization simulations and its superiority proved.Our study examines and proposes an effective method for validating and testing multiagent reinforcement learning models and methods using a multiagent simulation. 展开更多
关键词 multi-agent reinforced learning heat transfer mass transfer DRYING darcy flow MOISTURE optimization
下载PDF
Multi-Agent Reinforcement Learning for Resource Allocation in Io T Networks with Edge Computing 被引量:9
13
作者 Xiaolan Liu Jiadong Yu +1 位作者 Zhiyong Feng Yue Gao 《China Communications》 SCIE CSCD 2020年第9期220-236,共17页
To support popular Internet of Things(IoT)applications such as virtual reality and mobile games,edge computing provides a front-end distributed computing archetype of centralized cloud computing with low latency and d... To support popular Internet of Things(IoT)applications such as virtual reality and mobile games,edge computing provides a front-end distributed computing archetype of centralized cloud computing with low latency and distributed data processing.However,it is challenging for multiple users to offload their computation tasks because they are competing for spectrum and computation as well as Radio Access Technologies(RAT)resources.In this paper,we investigate computation offloading mechanism of multiple selfish users with resource allocation in IoT edge computing networks by formulating it as a stochastic game.Each user is a learning agent observing its local network environment to learn optimal decisions on either local computing or edge computing with a goal of minimizing long term system cost by choosing its transmit power level,RAT and sub-channel without knowing any information of the other users.Since users’decisions are coupling at the gateway,we define the reward function of each user by considering the aggregated effect of other users.Therefore,a multi-agent reinforcement learning framework is developed to solve the game with the proposed Independent Learners based Multi-Agent Q-learning(IL-based MA-Q)algorithm.Simulations demonstrate that the proposed IL-based MA-Q algorithm is feasible to solve the formulated problem and is more energy efficient without extra cost on channel estimation at the centralized gateway.Finally,compared with the other three benchmark algorithms,it has better system cost performance and achieves distributed computation offloading. 展开更多
关键词 edge computing multi-agent reinforcement learning internet of things
下载PDF
A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning 被引量:3
14
作者 MA Ye CHANG Tianqing FAN Wenhui 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2021年第3期642-657,共16页
In the evolutionary game of the same task for groups,the changes in game rules,personal interests,the crowd size,and external supervision cause uncertain effects on individual decision-making and game results.In the M... In the evolutionary game of the same task for groups,the changes in game rules,personal interests,the crowd size,and external supervision cause uncertain effects on individual decision-making and game results.In the Markov decision framework,a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules in the process of a game.The model can improve the result of a evolutionary game and facilitate the completion of the task.First,based on the multi-agent theory,to solve the existing problems in the original model,a negative feedback tax penalty mechanism is proposed to guide the strategy selection of individuals in the group.In addition,in order to evaluate the evolutionary game results of the group in the model,a calculation method of the group intelligence level is defined.Secondly,the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism.In the model,the selection strategy of the Q-learning algorithm is improved and a bounded rationality evolutionary game strategy is proposed based on the rule of evolutionary games and the consideration of the bounded rationality of individuals.Finally,simulation results show that the proposed model can effectively guide individuals to choose cooperation strategies which are beneficial to task completion and stability under different negative feedback factor values and different group sizes,so as to improve the group intelligence level. 展开更多
关键词 multi-agent reinforcement learning evolutionary game Q-learning
下载PDF
Knowledge transfer in multi-agent reinforcement learning with incremental number of agents 被引量:4
15
作者 LIU Wenzhang DONG Lu +1 位作者 LIU Jian SUN Changyin 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2022年第2期447-460,共14页
In this paper, the reinforcement learning method for cooperative multi-agent systems(MAS) with incremental number of agents is studied. The existing multi-agent reinforcement learning approaches deal with the MAS with... In this paper, the reinforcement learning method for cooperative multi-agent systems(MAS) with incremental number of agents is studied. The existing multi-agent reinforcement learning approaches deal with the MAS with a specific number of agents, and can learn well-performed policies. However, if there is an increasing number of agents, the previously learned in may not perform well in the current scenario. The new agents need to learn from scratch to find optimal policies with others,which may slow down the learning speed of the whole team. To solve that problem, in this paper, we propose a new algorithm to take full advantage of the historical knowledge which was learned before, and transfer it from the previous agents to the new agents. Since the previous agents have been trained well in the source environment, they are treated as teacher agents in the target environment. Correspondingly, the new agents are called student agents. To enable the student agents to learn from the teacher agents, we first modify the input nodes of the networks for teacher agents to adapt to the current environment. Then, the teacher agents take the observations of the student agents as input, and output the advised actions and values as supervising information. Finally, the student agents combine the reward from the environment and the supervising information from the teacher agents, and learn the optimal policies with modified loss functions. By taking full advantage of the knowledge of teacher agents, the search space for the student agents will be reduced significantly, which can accelerate the learning speed of the holistic system. The proposed algorithm is verified in some multi-agent simulation environments, and its efficiency has been demonstrated by the experiment results. 展开更多
关键词 knowledge transfer multi-agent reinforcement learning(MARL) new agents
下载PDF
Multi-agent reinforcement learning for edge information sharing in vehicular networks 被引量:3
16
作者 Ruyan Wang Xue Jiang +5 位作者 Yujie Zhou Zhidu Li Dapeng Wu Tong Tang Alexander Fedotov Vladimir Badenko 《Digital Communications and Networks》 SCIE CSCD 2022年第3期267-277,共11页
To guarantee the heterogeneous delay requirements of the diverse vehicular services,it is necessary to design a full cooperative policy for both Vehicle to Infrastructure(V2I)and Vehicle to Vehicle(V2V)links.This pape... To guarantee the heterogeneous delay requirements of the diverse vehicular services,it is necessary to design a full cooperative policy for both Vehicle to Infrastructure(V2I)and Vehicle to Vehicle(V2V)links.This paper investigates the reduction of the delay in edge information sharing for V2V links while satisfying the delay requirements of the V2I links.Specifically,a mean delay minimization problem and a maximum individual delay minimization problem are formulated to improve the global network performance and ensure the fairness of a single user,respectively.A multi-agent reinforcement learning framework is designed to solve these two problems,where a new reward function is proposed to evaluate the utilities of the two optimization objectives in a unified framework.Thereafter,a proximal policy optimization approach is proposed to enable each V2V user to learn its policy using the shared global network reward.The effectiveness of the proposed approach is finally validated by comparing the obtained results with those of the other baseline approaches through extensive simulation experiments. 展开更多
关键词 Vehicular networks Edge information sharing Delay guarantee multi-agent reinforcement learning Proximal policy optimization
下载PDF
Efficient Exploration for Multi-Agent Reinforcement Learning via Transferable Successor Features 被引量:2
17
作者 Wenzhang Liu Lu Dong +1 位作者 Dan Niu Changyin Sun 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2022年第9期1673-1686,共14页
In multi-agent reinforcement learning(MARL),the behaviors of each agent can influence the learning of others,and the agents have to search in an exponentially enlarged joint-action space.Hence,it is challenging for th... In multi-agent reinforcement learning(MARL),the behaviors of each agent can influence the learning of others,and the agents have to search in an exponentially enlarged joint-action space.Hence,it is challenging for the multi-agent teams to explore in the environment.Agents may achieve suboptimal policies and fail to solve some complex tasks.To improve the exploring efficiency as well as the performance of MARL tasks,in this paper,we propose a new approach by transferring the knowledge across tasks.Differently from the traditional MARL algorithms,we first assume that the reward functions can be computed by linear combinations of a shared feature function and a set of taskspecific weights.Then,we define a set of basic MARL tasks in the source domain and pre-train them as the basic knowledge for further use.Finally,once the weights for target tasks are available,it will be easier to get a well-performed policy to explore in the target domain.Hence,the learning process of agents for target tasks is speeded up by taking full use of the basic knowledge that was learned previously.We evaluate the proposed algorithm on two challenging MARL tasks:cooperative boxpushing and non-monotonic predator-prey.The experiment results have demonstrated the improved performance compared with state-of-the-art MARL algorithms. 展开更多
关键词 Knowledge transfer multi-agent systems reinforcement learning successor features
下载PDF
A Multi-Agent System for Environmental Monitoring Using Boolean Networks and Reinforcement Learning 被引量:7
18
作者 Hanzhong Zheng Dejie Shi 《Journal of Cyber Security》 2020年第2期85-96,共12页
Distributed wireless sensor networks have been shown to be effective for environmental monitoring tasks,in which multiple sensors are deployed in a wide range of the environments to collect information or monitor a pa... Distributed wireless sensor networks have been shown to be effective for environmental monitoring tasks,in which multiple sensors are deployed in a wide range of the environments to collect information or monitor a particular event,Wireless sensor networks,consisting of a large number of interacting sensors,have been successful in a variety of applications where they are able to share information using different transmission protocols through the communication network.However,the irregular and dynamic environment requires traditional wireless sensor networks to have frequent communications to exchange the most recent information,which can easily generate high communication cost through the collaborative data collection and data transmission.High frequency communication also has high probability of failure because of long distance data transmission.In this paper,we developed a novel approach to multi-sensor environment monitoring network using the idea of distributed system.Its communication network can overcome the difficulties of high communication cost and Single Point of Failure(SPOF)through the decentralized approach,which performs in-network computation.Our approach makes use of Boolean networks that allows for a non-complex method of corroboration and retains meaningful information regarding the dynamics of the communication network.Our approach also reduces the complexity of data aggregation process and employee a reinforcement learning algorithm to predict future event inside the environment through the pattern recognition. 展开更多
关键词 multi-agent system reinforcement learning environment monitoring
下载PDF
A new accelerating algorithm for multi-agent reinforcement learning 被引量:1
19
作者 张汝波 仲宇 顾国昌 《Journal of Harbin Institute of Technology(New Series)》 EI CAS 2005年第1期48-51,共4页
In multi-agent systems, joint-action must be employed to achieve cooperation because the evaluation of the behavior of an agent often depends on the other agents’ behaviors. However, joint-action reinforcement learni... In multi-agent systems, joint-action must be employed to achieve cooperation because the evaluation of the behavior of an agent often depends on the other agents’ behaviors. However, joint-action reinforcement learning algorithms suffer the slow convergence rate because of the enormous learning space produced by joint-action. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks, which demands all agents to learn predicting the probabilities of actions that other agents may execute. A multi-robot cooperation experiment is run to test the efficacy of the new algorithm, and the experiment results show that the new algorithm can achieve the cooperation policy much faster than the primitive reinforcement learning algorithm. 展开更多
关键词 distributed reinforcement learning accelerating algorithm machine learning multi-agent system
下载PDF
A Multi-Agent Reinforcement Learning-Based Collaborative Jamming System: Algorithm Design and Software-Defined Radio Implementation 被引量:1
20
作者 Luguang Wang Fei Song +5 位作者 Gui Fang Zhibin Feng Wen Li Yifan Xu Chen Pan Xiaojing Chu 《China Communications》 SCIE CSCD 2022年第10期38-54,共17页
In multi-agent confrontation scenarios, a jammer is constrained by the single limited performance and inefficiency of practical application. To cope with these issues, this paper aims to investigate the multi-agent ja... In multi-agent confrontation scenarios, a jammer is constrained by the single limited performance and inefficiency of practical application. To cope with these issues, this paper aims to investigate the multi-agent jamming problem in a multi-user scenario, where the coordination between the jammers is considered. Firstly, a multi-agent Markov decision process (MDP) framework is used to model and analyze the multi-agent jamming problem. Secondly, a collaborative multi-agent jamming algorithm (CMJA) based on reinforcement learning is proposed. Finally, an actual intelligent jamming system is designed and built based on software-defined radio (SDR) platform for simulation and platform verification. The simulation and platform verification results show that the proposed CMJA algorithm outperforms the independent Q-learning method and provides a better jamming effect. 展开更多
关键词 multi-agent reinforcement learning intelligent jamming collaborative jamming software-defined radio platform
下载PDF
上一页 1 2 6 下一页 到第
使用帮助 返回顶部