10,958 articles found
Designing Proportional-Integral Consensus Protocols for Second-Order Multi-Agent Systems Using Delayed and Memorized State Information
1
Authors: Honghai Wang, Qing-Long Han. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 4, pp. 878-892, 15 pages.
This paper is concerned with consensus of a second-order linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network. A proportional-integral consensus protocol is designed by using delayed and memorized state information. Under the proportional-integral consensus protocol, the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system. Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system, but also plays a critical role in the dynamic performance of the system. In this paper, based on recent results on the distribution of roots of quasi-polynomials, several necessary conditions for Hurwitz stability for a class of quasi-polynomials are first derived. Then allowable regions of consensus protocol parameters are estimated. Some necessary and sufficient conditions for determining effective protocol parameters are provided. The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system. Moreover, the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated. Furthermore, some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.
Keywords: Consensus protocol; Hurwitz stability; multi-agent systems; quasi-polynomials; time delay
Finite-time Prescribed Performance Time-Varying Formation Control for Second-Order Multi-Agent Systems With Non-Strict Feedback Based on a Neural Network Observer
2
Authors: Chi Ma, Dianbiao Dong. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 4, pp. 1039-1050, 12 pages.
This paper studies the problem of time-varying formation control with finite-time prescribed performance for non-strict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To eliminate nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of the actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Keywords: Finite-time control; multi-agent systems; neural network; prescribed performance control; time-varying formation control
Performance Evaluation of Multi-Agent Reinforcement Learning Algorithms
3
Authors: Abdulghani M. Abdulghani, Mokhles M. Abdulghani, Wilbur L. Walters, Khalid H. Abed. Intelligent Automation & Soft Computing, 2024, No. 2, pp. 337-352, 16 pages.
Multi-Agent Reinforcement Learning (MARL) has proven to be successful in cooperative assignments. MARL is used to investigate how autonomous agents with the same interests can connect and act in one team. MARL cooperation scenarios are explored in recreational cooperative augmented reality environments, as well as real-world scenarios in robotics. In this paper, we explore the realm of MARL and its potential applications in cooperative assignments. Our focus is on developing a multi-agent system that can collaborate to attack or defend against enemies and achieve victory with minimal damage. To accomplish this, we utilize the StarCraft Multi-Agent Challenge (SMAC) environment and train four MARL algorithms: Q-learning with Mixtures of Experts (QMIX), Value-Decomposition Network (VDN), Multi-agent Proximal Policy Optimizer (MAPPO), and Multi-Agent Actor Attention Critic (MAA2C). These algorithms allow multiple agents to cooperate in a specific scenario to achieve the targeted mission. Our results show that the QMIX algorithm outperforms the other three algorithms in the attacking scenario, while the VDN algorithm achieves the best results in the defending scenario. Specifically, the VDN algorithm reaches the highest value of battle won mean and the lowest value of dead allies mean. Our research demonstrates the potential for MARL algorithms to be used in real-world applications, such as controlling multiple robots to provide helpful services or coordinating teams of agents to accomplish tasks that would be impossible for a human to do. The SMAC environment provides a unique opportunity to test and evaluate MARL algorithms in a challenging and dynamic environment, and our results show that these algorithms can be used to achieve victory with minimal damage.
Keywords: Reinforcement learning; RL; multi-agent; MARL; SMAC; VDN; QMIX; MAPPO
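As a reader's aid for the VDN entry above, the following is a minimal sketch of the additive value decomposition that gives VDN its name: the joint value is the sum of per-agent values, so each agent can act greedily on its own values. It is not taken from the paper; the function names and the Q-value table are hypothetical and purely illustrative.

```python
from typing import List, Sequence


def vdn_joint_value(per_agent_q: Sequence[Sequence[float]], actions: Sequence[int]) -> float:
    """Joint value as the plain sum of each agent's Q-value for its chosen action."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))


def greedy_joint_action(per_agent_q: Sequence[Sequence[float]]) -> List[int]:
    """Each agent picks its own argmax; with an additive joint value this
    choice also maximizes the joint value."""
    return [max(range(len(q)), key=q.__getitem__) for q in per_agent_q]


# Hypothetical Q-values for 3 agents with 2 actions each (illustrative numbers only).
q_values = [[0.2, 0.9], [1.5, 0.1], [0.0, 0.4]]
joint = greedy_joint_action(q_values)
total = vdn_joint_value(q_values, joint)
```

Because the sum is separable, no search over the joint action space is needed at execution time, which is what makes decentralized execution possible in this family of methods.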
Multi-Agent-Based Design of an Autonomous Combat System for UAV Swarm Systems of Systems
4
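Authors: 张堃, 华帅, 袁斌林, 杜睿怡. 《系统工程与电子技术》 (Systems Engineering and Electronics) (EI, CSCD, PKU Core), 2024, No. 4, pp. 1273-1286, 14 pages.
To address key issues in the design of autonomous combat systems of systems for unmanned swarms, a Multi-Agent-based design method for an unmanned swarm autonomous combat system is proposed. Agent models and their inference rules are established for each node of the unmanned swarm. To meet the simulation system's requirements for modularity and generality, interoperable system interfaces and the interaction relationships of unmanned swarm autonomous combat are designed. Simulation-based verification of the unmanned swarm system is then carried out. The simulation results show that the proposed design can not only effectively carry out and complete a dynamic demonstration and verification of the whole process of autonomous combat network generation, swarm evolution, and effectiveness evaluation, but can also further evaluate the cooperative combat effectiveness of the unmanned swarm through repeated randomized trials. Finally, strategies and lessons for swarm cooperative combat are summarized.
Keywords: multi-agent; unmanned swarm; system-of-systems design; cooperative combat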
UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach (Cited by: 1)
5
Authors: Jiawen Kang, Junlong Chen, Minrui Xu, Zehui Xiong, Yutao Jiao, Luchao Han, Dusit Niyato, Yongju Tong, Shengli Xie. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 2, pp. 430-445, 16 pages.
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consumes intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSU) or unmanned aerial vehicles (UAV) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses.
Keywords: Avatar; blockchain; metaverses; multi-agent deep reinforcement learning; transformer; UAVs
Multi-Robot Systems Based on MAS (Multi-Agent System): An Important Direction in the Development of Cooperative Multi-Robotics (Cited by: 20)
6
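Authors: 陈忠泽, 林良明, 颜国正. 《机器人》 (Robot) (EI, CSCD, PKU Core), 2001, No. 4, pp. 368-373, 6 pages.
The way robots are applied is shifting from component-style unit applications toward system-level applications. This is driven by practical needs and is also an inevitable trend of technological development; advances in related technologies such as computer networking provide corresponding support for realizing it. The theory of multi-robot cooperation has consequently become a hot topic in robotics research, and within it the theory of multi-agent systems (MAS) from distributed artificial intelligence (DAI) has attracted the attention of researchers in multi-robot cooperation theory. By revealing the intrinsic connection between cooperative multi-robot systems and MAS, this paper points out that MAS-based cooperative multi-robot systems are an important direction in the development of cooperative multi-robotics.
Keywords: multi-robot systems; multi-agent systems; cooperative multi-robotics; MAS; artificial intelligence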
Kang'ao Heji (抗奥合剂) Alleviates Acute Lung Injury via the p38 MAPK/NF-κB Signaling Pathway and the ACE2/Ang1-7/Mas Axis
7
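Authors: 陈思琪, 严佳煜, 李瑞, 顾宁. 《南京中医药大学学报》 (Journal of Nanjing University of Traditional Chinese Medicine) (CAS, CSCD, PKU Core), 2024, No. 5, pp. 446-456, 11 pages.
Objective: To investigate the effect and mechanism of Kang'ao Heji (KAHJ) in treating acute lung injury (ALI) in mice, and to provide evidence for its possible use as a drug for relieving symptoms after novel coronavirus (COVID-19) infection. Methods: Network pharmacology was used to predict the main active components, potential targets, and related signaling pathways of KAHJ in treating ALI. C57BL/6J mice were randomly divided into a control group, an LPS group, and an LPS+KAHJ group. Mice in the LPS+KAHJ group were given KAHJ by gavage (4.76 g·kg^(-1)·d^(-1), 8.8 mL·kg^(-1)·d^(-1)), and mice in the other groups were given normal saline by gavage (8.8 mL·kg^(-1)·d^(-1)). After 14 d, the ALI model was induced by intraperitoneal injection of LPS (5 mg·kg^(-1)). Serum and lung tissue were collected, and pathological changes in the lung tissue were observed by histopathology. Western blot, qPCR, ELISA, and IHC were used to evaluate the ameliorating effect of KAHJ on ALI. Results: Network pharmacology screening identified 70 core target genes shared by the disease and the drug, which were closely related to multiple signaling pathways, such as the MAPK, NF-κB, apoptosis, COVID-19, and renin-angiotensin system (RAS) signaling pathways. In addition, experimental validation showed that KAHJ improved inflammation and apoptosis after ALI in mice, reduced lung injury and pulmonary edema, and inhibited pulmonary fibrosis. The mechanism of action of KAHJ was also closely related to the phosphorylation of p38 MAPK and NF-κB and to the regulation of the ACE2/Ang1-7/Mas axis. Conclusion: KAHJ may alleviate ALI by inhibiting the p38 MAPK/NF-κB signaling pathway and regulating the ACE2/Ang1-7/Mas axis, providing a complementary and alternative drug for relieving symptoms after COVID-19 infection.
Keywords: acute lung injury; p38 MAPK/NF-κB signaling pathway; ACE2/Ang1-7/Mas; novel coronavirus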
Reinforcement Learning-Based MAS Interception in Antagonistic Environments
8
Authors: Siqing Sun, Defu Cai, Hai-Tao Zhang, Ning Xing. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 270-272, 3 pages.
Dear Editor, As a promising multi-agent systems (MASs) operation, autonomous interception, where defenders prevent intruders from reaching destinations, has attracted increasing attention in recent years. So far, most of the relevant methods are applied in ideal environments without agent damage. As a remedy, this letter proposes a more realistic interception method for MASs that suffer damage.
Keywords: agent; MAS; destination
Hyperbolic Tangent Function-Based Protocols for Global/Semi-Global Finite-Time Consensus of Multi-Agent Systems
9
Authors: Zongyu Zuo, Jingchuan Tang, Ruiqi Ke, Qing-Long Han. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 6, pp. 1381-1397, 17 pages.
This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent systems. New hyperbolic tangent function-based protocols are proposed to achieve global and semi-global finite-time consensus for both single-integrator and double-integrator multi-agent systems with leaderless undirected and leader-following directed communication topologies. These new protocols not only provide an explicit upper-bound estimate for the settling time, but also have a user-prescribed bounded control level. In addition, compared to some existing results based on the saturation function, the proposed approach considerably simplifies the protocol design and the stability analysis. Illustrative examples and an application demonstrate the effectiveness of the proposed protocols.
Keywords: Consensus protocol; finite-time consensus; hyperbolic tangent function; multi-agent systems
Targeted multi-agent communication algorithm based on state control
10
Authors: Li-yang Zhao, Tian-qing Chang, Lei Zhang, Jie Zhang, Kai-xuan Chu, De-peng Kong. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 1, pp. 544-556, 13 pages.
As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents. However, the existing communication schemes can bring much timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from a real communication message of each agent and a fusion message generated from the buffered message, the correctness of the final action choice of the agent is ensured. Our evaluation using a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve the learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
Keywords: Multi-agent deep reinforcement learning; state control; targeted interaction; communication mechanism
An Optimal Control-Based Distributed Reinforcement Learning Framework for A Class of Non-Convex Objective Functionals of the Multi-Agent Network (Cited by: 2)
11
Authors: Zhe Chen, Ning Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 11, pp. 2081-2093, 13 pages.
This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of the multi-agent network under privacy protection, which means that the local objective of each agent is unknown to others. The above problem involves complexity simultaneously in the time and space aspects. Yet existing works about distributed optimization mainly consider privacy protection in the space aspect where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered in this paper, the decision variable is a continuous function concerning time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal decision derivative function rather than the decision function. This manner can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework.
Keywords: Distributed optimization; multi-agent; optimal control; reinforcement learning (RL)
Computation Tree Logic Model Checking of Multi-Agent Systems Based on Fuzzy Epistemic Interpreted Systems
12
Authors: Xia Li, Zhanyou Ma, Zhibao Mian, Ziyuan Liu, Ruiqi Huang, Nana He. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 4129-4152, 24 pages.
Model checking is an automated formal verification method to verify whether epistemic multi-agent systems adhere to property specifications. Although there is an extensive literature on qualitative properties such as safety and liveness, there is still a lack of quantitative and uncertain property verifications for these systems. In uncertain environments, agents must make judicious decisions based on subjective epistemic. To verify epistemic and measurable properties in multi-agent systems, this paper extends fuzzy computation tree logic by introducing epistemic modalities and proposing a new Fuzzy Computation Tree Logic of Knowledge (FCTLK). We represent fuzzy multi-agent systems as distributed knowledge bases with fuzzy epistemic interpreted systems. In addition, we provide a transformation algorithm from fuzzy epistemic interpreted systems to fuzzy Kripke structures, as well as transformation rules from FCTLK formulas to Fuzzy Computation Tree Logic (FCTL) formulas. Accordingly, we transform the FCTLK model checking problem into the FCTL model checking. This enables the verification of FCTLK formulas by using the fuzzy model checking algorithm of FCTL without additional computational overheads. Finally, we present correctness proofs and complexity analyses of the proposed algorithms. Additionally, we further illustrate the practical application of our approach through an example of a train control system.
Keywords: Model checking; multi-agent systems; fuzzy epistemic interpreted systems; fuzzy computation tree logic; transformation algorithm
An Improved Bounded Conflict-Based Search for Multi-AGV Pathfinding in Automated Container Terminals
13
Authors: Xinci Zhou, Jin Zhu. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 6, pp. 2705-2727, 23 pages.
As the number of automated guided vehicles (AGVs) within automated container terminals (ACT) continues to rise, conflicts have become more frequent. Addressing point and edge conflicts of AGVs, a multi-AGV conflict-free path planning model has been formulated to minimize the total path length of AGVs between shore bridges and yards. For larger terminal maps and complex environments, the grid method is employed to model AGVs' road networks. An improved bounded conflict-based search (IBCBS) algorithm tailored to ACT is proposed, leveraging the binary tree principle to resolve conflicts and employing focal search to expand the search range. Comparative experiments involving 60 AGVs indicate a reduction in computing time by 37.397% to 64.06% while maintaining the over cost within 1.019%. Numerical experiments validate the proposed algorithm's efficacy in enhancing efficiency and ensuring solution quality.
Keywords: Automated terminals; multi-AGV; multi-agent path finding (MAPF); conflict-based search (CBS); AGV path planning
Experimental aspects of ^(14)N overtone RESPDOR solid-state NMR spectroscopy under MAS beyond 60 kHz
14
Authors: Yutaro Ogaeri, Yusuke Nishiyama. Magnetic Resonance Letters, 2024, No. 1, pp. 40-49, 10 pages.
Nitrogen-14 (^(14)N) overtone (OT) spectroscopy under fast magic angle spinning (MAS) conditions (>60 kHz) has emerged as a powerful technique for observing correlations and distances between ^(14)N and ^(1)H, owing to the absence of the first-order quadrupolar broadenings. In addition, ^(14)N^(OT) allows selective manipulation of ^(14)N nuclei for each site. Despite extensive theoretical and experimental studies, the spin dynamics of ^(14)N^(OT) remains under debate. In this study, we conducted experimental investigations to assess the spin dynamics of ^(14)N^(OT) using the rotational-echo saturation-pulse double-resonance (RESPDOR) sequence, which monitors population transfer induced by a ^(14)N^(OT) pulse. The ^(14)N^(OT) spin dynamics is well represented by a model of a two-energy-level system. Unlike spin-1/2, the maximum excitation efficiency of ^(14)N^(OT) coherences of powdered solids, denoted by p, depends on the radiofrequency field (rf-field) strength due to orientation dependence of effective nutation fields even when pulse lengths are optimized. It is also found that the p factor, contributing to the ^(14)N^(OT) spin dynamics, is nearly independent of the B0 field. Consequently, the filtering efficiency of RESPDOR experiments exhibits negligible dependence on B0 when the ^(14)N^(OT) pulse length is optimized. The study also identifies the optimal experimental conditions for ^(14)N^(OT)/^(1)H RESPDOR correlation experiments.
Keywords: ^(14)N; overtone; RESPDOR; ^(14)N/^(1)H correlation; solid-state NMR; fast MAS
Connectivity-maintaining Consensus of Multi-agent Systems With Communication Management Based on Predictive Control Strategy
15
Authors: Jie Wang, Shaoyuan Li, Yuanyuan Zou. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 3, pp. 700-710, 11 pages.
This paper studies the connectivity-maintaining consensus of multi-agent systems. Considering the impact of the sensing ranges of agents for connectivity and communication energy consumption, a novel communication management strategy is proposed for multi-agent systems so that the connectivity of the system can be maintained and the communication energy can be saved. In this paper, communication management means a strategy about how the sensing ranges of agents are adjusted in the process of reaching consensus. The proposed communication management in this paper is not coupled with controller but only imposes a constraint for controller, so there is more freedom to develop an appropriate control strategy for achieving consensus. For the multi-agent systems with this novel communication management, a predictive control based strategy is developed for achieving consensus. Simulation results indicate the effectiveness and advantages of our scheme.
Keywords: Consensus; energy-saving; multi-agent system; predictive control
MAQMC: Multi-Agent Deep Q-Network for Multi-Zone Residential HVAC Control
16
Authors: Zhengkai Ding, Qiming Fu, Jianping Chen, You Lu, Hongjie Wu, Nengwei Fang, Bin Xing. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 9, pp. 2759-2785, 27 pages.
The optimization of multi-zone residential heating, ventilation, and air conditioning (HVAC) control is not an easy task due to its complex dynamic thermal model and the uncertainty of occupant-driven cooling loads. Deep reinforcement learning (DRL) methods have recently been proposed to address the HVAC control problem. However, the application of single-agent DRL for multi-zone residential HVAC control may lead to non-convergence or slow convergence. In this paper, we propose MAQMC (Multi-Agent deep Q-network for multi-zone residential HVAC Control) to address this challenge with the goal of minimizing energy consumption while maintaining occupants' thermal comfort. MAQMC is divided into MAQMC2 (MAQMC with two agents: one agent controls the temperature of each zone, and the other agent controls the humidity of each zone) and MAQMC3 (MAQMC with three agents: three agents control the temperature and humidity of three zones, respectively). The experimental results show that MAQMC3 can reduce energy consumption by 6.27% and MAQMC2 by 3.73% compared with the fixed point; compared with the rule-based method, MAQMC3 and MAQMC2 can reduce comfort violations by 61.89% and 59.07%, respectively. In addition, experiments with different regional weather data demonstrate that the well-trained MAQMC RL agents have the robustness and adaptability to unknown environments.
Keywords: Deep reinforcement learning; multi-zone residential HVAC; multi-agent; energy conservation; comfort
Lyapunov-Based Output Containment Control of Heterogeneous Multi-Agent Systems With Markovian Switching Topologies and Distributed Delays
17
Authors: Haihua Guo, Min Meng, Gang Feng. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 6, pp. 1421-1433, 13 pages.
This paper considers the mean square output containment control problem for heterogeneous multi-agent systems (MASs) with randomly switching topologies and nonuniform distributed delays. By modeling the switching topologies as a continuous-time Markov process and taking the distributed delays into consideration, a novel distributed containment observer is proposed to estimate the convex hull spanned by the leaders' states. A novel distributed output feedback containment controller is then designed without using the prior knowledge of distributed delays. By constructing a novel switching Lyapunov functional, the output containment control problem is then solved in the sense of mean square under an easily-verifiable sufficient condition. Finally, two numerical examples are given to show the effectiveness of the proposed controller.
Keywords: Heterogeneous multi-agent systems; Lyapunov method; Markovian switching topologies; output containment control; time delays
Privacy Preserving Demand Side Management Method via Multi-Agent Reinforcement Learning
18
Authors: Feiye Zhang, Qingyu Yang, Dou An. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 10, pp. 1984-1999, 16 pages.
The smart grid utilizes the demand side management technology to motivate energy users towards cutting demand during peak power consumption periods, which greatly improves the operation efficiency of the power grid. However, as the number of energy users participating in the smart grid continues to increase, the demand side management strategy of individual agent is greatly affected by the dynamic strategies of other agents. In addition, the existing demand side management methods, which need to obtain users’ power consumption information, seriously threaten the users’ privacy. To address the dynamic issue in the multi-microgrid demand side management model, a novel multi-agent reinforcement learning method based on centralized training and decentralized execution paradigm is presented to mitigate the damage of training performance caused by the instability of training experience. In order to protect users’ privacy, we design a neural network with fixed parameters as the encryptor to transform the users’ energy consumption information from low-dimensional to high-dimensional and theoretically prove that the proposed encryptor-based privacy preserving method will not affect the convergence property of the reinforcement learning algorithm. We verify the effectiveness of the proposed demand side management scheme with the real-world energy consumption data of Xi’an, Shaanxi, China. Simulation results show that the proposed method can effectively improve users’ satisfaction while reducing the bill payment compared with traditional reinforcement learning (RL) methods (i.e., deep Q learning (DQN), deep deterministic policy gradient (DDPG), QMIX and multi-agent deep deterministic policy gradient (MADDPG)). The results also demonstrate that the proposed privacy protection scheme can effectively protect users’ privacy while ensuring the performance of the algorithm.
Keywords: Centralized training and decentralized execution; demand side management; multi-agent reinforcement learning; privacy preserving
Cooperative multi-target hunting by unmanned surface vehicles based on multi-agent reinforcement learning
19
Authors: Jiawei Xia, Yasong Luo, Zhikun Liu, Yalun Zhang, Haoran Shi, Zhong Liu. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2023, No. 11, pp. 80-94, 15 pages.
To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function and the action space applicable to multi-target hunting tasks are designed. To deal with the dynamic change of observational feature dimension input by partially observable systems, a feature embedding block is proposed. By combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in the test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed from the aspects of algorithm performance, migration effect in task scenarios and self-organization capability after being damaged, and the potential deployment and application of DPOMH-PPO in the real environment is verified.
Keywords: Unmanned surface vehicles; multi-agent deep reinforcement learning; cooperative hunting; feature embedding; proximal policy optimization
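The feature-embedding idea in the entry above (turning a variable number of observed entities into a fixed-size input) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the CMP and CAP outputs are combined, so concatenation is assumed here, and `embed_observations`, `feature_dim`, and the zero fallback for an empty observation are hypothetical choices.

```python
from typing import List, Sequence


def embed_observations(rows: Sequence[Sequence[float]], feature_dim: int) -> List[float]:
    """Compress a variable number of per-entity feature rows into a fixed-length vector.

    Column-wise max pooling (CMP) and column-wise average pooling (CAP) are
    concatenated (an assumption), so the output length is always 2 * feature_dim.
    """
    if any(len(row) != feature_dim for row in rows):
        raise ValueError("every row must have exactly feature_dim entries")
    if not rows:
        # Assumption: with nothing observed, fall back to an all-zero encoding.
        return [0.0] * (2 * feature_dim)
    columns = list(zip(*rows))  # transpose: one tuple per feature column
    cmp_part = [max(col) for col in columns]
    cap_part = [sum(col) / len(col) for col in columns]
    return cmp_part + cap_part


# Two and three observed entities yield encodings of the same length.
two = embed_observations([[1.0, 4.0], [3.0, 2.0]], feature_dim=2)
three = embed_observations([[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]], feature_dim=2)
```

Both pooling operations are invariant to the order of the rows, which suits observations where the observed entities have no natural ordering.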
Multi-Agent Hierarchical Graph Attention Reinforcement Learning for Grid-Aware Energy Management
20
Authors: FENG Bingyi, FENG Mingxiao, WANG Minrui, ZHOU Wengang, LI Houqiang. ZTE Communications, 2023, No. 3, pp. 11-21, 11 pages.
The increasing adoption of renewable energy has posed challenges for voltage regulation in power distribution networks. Grid-aware energy management, which includes the control of smart inverters and energy management systems, is a trending way to mitigate this problem. However, existing multi-agent reinforcement learning methods for grid-aware energy management have not sufficiently considered the importance of agent cooperation and the unique characteristics of the grid, which leads to limited performance. In this study, we propose a new approach named multi-agent hierarchical graph attention reinforcement learning framework (MAHGA) to stabilize the voltage. Specifically, under the paradigm of centralized training and decentralized execution, we model the power distribution network as a novel hierarchical graph containing the agent-level topology and the bus-level topology. Then a hierarchical graph attention model is devised to capture the complex correlation between agents. Moreover, we incorporate graph contrastive learning as an auxiliary task in the reinforcement learning process to improve representation learning from graphs. Experiments on several real-world scenarios reveal that our approach achieves the best performance and can reduce the number of voltage violations remarkably.
Keywords: Demand-side management; graph neural networks; multi-agent reinforcement learning; voltage regulation