This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed, which can equivalently decompose the global Q value of the traffic system into the local values of several regions. Based on the framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. To achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. Experiments in synthetic, real-world, and city-level scenarios show that the proposed RegionSTLight converges more quickly, is more stable, and obtains better asymptotic performance than state-of-the-art models.
This paper investigates the robust cooperative output regulation problem for a class of heterogeneous uncertain linear multi-agent systems with an unknown exosystem via event-triggered control (ETC). By utilizing the internal model approach and the adaptive control technique, a distributed adaptive internal model is constructed for each agent. Then, based on this internal model, a fully distributed ETC strategy composed of a distributed event-triggered adaptive output feedback control law and a distributed dynamic event-triggering mechanism is proposed, in which each agent updates its control input at its own triggering time instants. It is shown that under the proposed ETC strategy, the robust cooperative output regulation problem can be solved without requiring either the global information associated with the communication topology or the bounds of the uncertain or unknown parameters in each agent and the exosystem. A numerical example is provided to illustrate the effectiveness of the proposed control strategy.
This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict-feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To handle the nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of the actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that the formation errors of the multi-agent systems converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Battery energy storage systems (BESSs) are widely used in smart grids. However, the power consumed by the inner impedance and the capacity degradation of each battery unit become particularly severe, which increases operating costs. The general economic dispatch (ED) algorithm based on marginal cost (MC) consensus is usually a proportional (P) controller, which suffers from slow convergence and low control accuracy. To solve the distributed ED problem of the isolated BESS network with good dynamic and steady-state performance, this paper designs a proportional-integral (PI) controller with a reset mechanism (PI+R) that asymptotically drives the MC consensus error and the total power mismatch towards 0. Specifically, the integral term in the PI controller is reset to 0 when the proportional term undergoes a zero crossing, which accelerates convergence, improves control accuracy, and avoids overshoot. The eigenvalues of the system under the PI+R controller are analyzed, ensuring the regularity of the system and enabling the reset mechanism. To ensure supply and demand balance within the isolated BESSs, a centralized reset mechanism is introduced, so that the controller is distributed in the flow set and centralized in the jump set. To cope with Zeno behavior and input delay, a dwell time during which the system resides in the flow set is given. Based on this, the system with input delays can be reduced to a delay-free system. Considering the capacity limitation of the battery, a modified MC scheme with the PI+R controller is designed. The correctness of the designed scheme is verified through simulations.
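The reset rule in the abstract above (clear the integral term when the proportional term crosses zero) can be illustrated with a minimal single-loop sketch. This is not the paper's distributed controller: the class name `ResetPI`, the gains, the integrator plant in `simulate`, and the sample-to-sample sign test used to detect the zero crossing are all assumptions made for illustration, and the MC consensus, centralized jump set, and dwell time are omitted.

```python
from dataclasses import dataclass


@dataclass
class ResetPI:
    """Scalar PI controller whose integral state is cleared at zero crossings.

    Hypothetical illustration only; gains and structure are assumptions.
    """
    kp: float
    ki: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        if error * self.prev_error < 0.0:
            # Jump: the error (and so the proportional term) changed sign
            # since the previous sample, so reset the integral term to 0.
            self.integral = 0.0
        else:
            # Flow: ordinary accumulation of the error.
            self.integral += error * dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral


def simulate(controller: ResetPI, target: float = 1.0, x0: float = 0.0,
             dt: float = 0.01, steps: int = 2000) -> float:
    """Drive an assumed integrator plant x' = u toward `target` (Euler steps)."""
    x = x0
    for _ in range(steps):
        u = controller.step(target - x, dt)
        x += u * dt
    return x


if __name__ == "__main__":
    final_state = simulate(ResetPI(kp=2.0, ki=1.0))
    print(f"final state: {final_state:.4f}")
```

Resetting at the sign change discards the integral charge accumulated while the error was positive, which is the part of a plain PI response that would otherwise push the state past the target.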
As an important mechanism in multi-agent interaction, communication allows agents to form complex team relationships rather than constitute a simple set of independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously limits their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the agent's final action choice is ensured. An evaluation on a challenging set of StarCraft II benchmarks indicates that SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
This paper examines the difficulties of managing distributed power systems, notably due to the increasing use of renewable energy sources, and focuses on voltage control challenges exacerbated by their variable nature in modern power grids. To tackle the challenges of voltage control in distributed renewable energy networks, researchers are increasingly turning towards multi-agent reinforcement learning (MARL). However, MARL raises safety concerns because agent actions are unpredictable during the exploration phase, which can lead to unsafe control measures. To mitigate these safety concerns in MARL-based voltage control, the study introduces a novel approach: Safety-Constrained Multi-Agent Reinforcement Learning (SC-MARL). This approach incorporates a safety constraint module specifically designed for voltage control within the MARL framework, which ensures that the MARL agents carry out voltage control actions safely. The experiments demonstrate that, in the 33-bus, 141-bus, and 322-bus power systems, employing SC-MARL for voltage control reduced the Voltage Out of Control Rate (%V.out) from 0.43, 0.24, and 2.95 to 0, 0.01, and 0.03, respectively. Additionally, the Reactive Power Loss (Q loss) decreased from 0.095, 0.547, and 0.017 to 0.062, 0.452, and 0.016 in the corresponding systems.
This paper presents a distributed scheme with limited communications, aiming to achieve cooperative motion control for multiple omnidirectional mobile manipulators (MOMMs). The proposed scheme extends the existing single-agent motion control to cater to scenarios involving the cooperative operation of MOMMs. Specifically, squeeze-free cooperative load transportation is achieved for the end-effectors of MOMMs by incorporating cooperative repetitive motion planning (CRMP), while guiding each individual to desired poses. Then, the distributed scheme is formulated as a time-varying quadratic programming (QP) problem and solved online utilizing a noise-tolerant zeroing neural network (NTZNN). Theoretical analysis shows that the NTZNN model converges globally to the optimal solution of the QP in the presence of noise. Finally, the effectiveness of the control design is demonstrated by numerical simulations and physical platform experiments.
In this article, lane change models for mixed traffic flow under cooperative adaptive cruise control (CACC) platoon formation are established. The analysis begins by examining the impact of lane changes on traffic flow stability. The influences of various factors such as lane change locations, timing, and the current traffic state on stability are discussed. In this analysis, it is assumed that the lane change location and the entry position in the adjacent lane have already been selected, without considering the specific intention behind the lane change. The speeds of the involved vehicles are adjusted based on an existing lane change model, and various conditions are analyzed for traffic flow disturbances, including duration, shock amplitude, and driving delays. Numerical calculations are provided to illustrate these effects. Additionally, traffic flow stability is factored into the lane change decision-making process. By incorporating disturbances to the fleet into the lane change income model, both a lane change intention model and a lane change execution model are constructed. These models are then compared with a model that does not account for stability, leading to the corresponding conclusions.
In the existing formation model, vehicles in the same lane or adjacent lane are regarded as the structure, and the driving behavior of vehicles is studied from the perspectives of safety, speed consistency, and stability. The speed control model is proposed from the perspective of the vehicles themselves, to obtain a stable fleet with the same distance and speed. However, in this process, the initial condition of the vehicle, the traffic flow environment, and the efficiency of the fleet formation are less considered. Therefore, based on a summary of the existing fleet building models, this paper puts forward a rapid construction model and algorithm for a cooperative adaptive cruise control platoon fleet. One of the important goals of forming a platoon is for vehicles to join it with the smoothest trajectory in the shortest time. Therefore, this chapter studies the trajectory optimization of the vehicle formation process from the perspective of vehicle dynamics.
The cooperative control and stability analysis problems for the multi-agent system with sampled communication are investigated. Distributed state feedback controllers are adopted for the cooperation of networked agents. A theorem in the form of linear matrix inequalities (LMIs) is derived to analyze the system stability. Another theorem in the form of an optimization problem subject to LMI constraints is proposed to design the controller, and then the algorithm is presented. The simulation results verify the validity and the effectiveness of the proposed approach.
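As background for the sampled-communication setting in the abstract above, the following minimal sketch shows a textbook first-order consensus update in which each agent only uses neighbor states exchanged once per sampling period. It is not the paper's LMI-based design: the single-integrator agents, the function names `consensus_step` and `run_consensus`, and the fixed scalar `gain` and `period` are assumptions chosen for illustration.

```python
from typing import List, Sequence


def consensus_step(states: Sequence[float],
                   adjacency: Sequence[Sequence[float]],
                   gain: float,
                   period: float) -> List[float]:
    """One sampling period of distributed state feedback.

    Agent i applies u_i = gain * sum_j a_ij * (x_j - x_i), held constant
    over the period, to assumed single-integrator dynamics x_i' = u_i.
    """
    n = len(states)
    updated = []
    for i in range(n):
        disagreement = sum(adjacency[i][j] * (states[j] - states[i])
                           for j in range(n))
        updated.append(states[i] + gain * period * disagreement)
    return updated


def run_consensus(states: Sequence[float],
                  adjacency: Sequence[Sequence[float]],
                  gain: float = 1.0,
                  period: float = 0.2,
                  steps: int = 200) -> List[float]:
    """Repeat the sampled update `steps` times and return the final states."""
    current = list(states)
    for _ in range(steps):
        current = consensus_step(current, adjacency, gain, period)
    return current


if __name__ == "__main__":
    # Undirected path graph 0 - 1 - 2 (an assumed example topology).
    path_graph = [[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]]
    print(run_consensus([0.0, 3.0, 6.0], path_graph))
```

For this scalar protocol on an undirected graph, the product `gain * period` has to stay small relative to the largest node degree for the iteration to converge; choosing such gains systematically for general agent dynamics is the kind of question the LMI conditions in the abstract address.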
To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. First, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function, and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamically changing dimension of the observational features in partially observable systems, a feature embedding block is proposed. By combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), an observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to train the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments verify the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are analyzed in terms of algorithm performance, migration effect across task scenarios, and self-organization capability after damage, and the potential for deploying DPOMH-PPO in real environments is verified.
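The feature embedding idea in the abstract above (turning a variable number of observed entities into a fixed-size input via CMP and CAP) can be sketched as follows. The abstract does not say how the two pooled vectors are combined, so the concatenation used here, the function name `embed_observations`, and the all-zeros output for an empty observation are assumptions for illustration only.

```python
from typing import List, Sequence


def embed_observations(rows: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Compress any number of length-`dim` feature rows into 2 * dim values.

    The first `dim` entries are the column-wise maxima (CMP) and the last
    `dim` entries are the column-wise averages (CAP).
    """
    if not rows:
        # Hypothetical convention: nothing observed maps to a zero vector.
        return [0.0] * (2 * dim)
    for row in rows:
        if len(row) != dim:
            raise ValueError("every observation row must have `dim` features")
    columns = list(zip(*rows))  # transpose: one tuple per feature column
    cmp_part = [max(col) for col in columns]
    cap_part = [sum(col) / len(col) for col in columns]
    return cmp_part + cap_part


if __name__ == "__main__":
    two_entities = [[1.0, 4.0], [3.0, 2.0]]
    three_entities = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]
    # Both calls return a vector of length 4, regardless of the row count.
    print(embed_observations(two_entities, dim=2))
    print(embed_observations(three_entities, dim=2))
```

Because both pooling operations act per column, the output length depends only on the feature dimension, so the same policy network can be used however many entities a USV currently observes.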
This paper addresses the cooperative control problem of multiple unmanned aerial vehicle (multi-UAV) systems. First, a new distributed consensus algorithm for second-order nonlinear multi-agent systems (MAS) is formulated under the leader-following approach. The algorithm provides smooth input signals to the agents' control channels, which avoids the chattering effect generated by conventional sliding mode-based control protocols. Second, a new formation control scheme is developed by integrating smooth distributed consensus control protocols into the geometric pattern model to achieve three-dimensional formation tracking. Lyapunov theory is used to prove the stability and convergence of both the distributed consensus and formation controllers. The effectiveness of the proposed algorithms is demonstrated through simulation results.
Among the promising applications of autonomous surface vessels (ASVs) is the use of multiple autonomous tugs for manipulating a floating object such as an oil platform, a broken ship, or a ship in port areas. Considering the real conditions and operations of maritime practice, this paper proposes a multi-agent control algorithm to manipulate a ship to a desired position with a desired heading and velocity under environmental disturbances. The control architecture consists of a supervisory controller in the higher layer and tug controllers in the lower layer. The supervisory controller allocates the towing forces and angles between the tugs and the ship by minimizing the error in the position and velocity of the ship. The weight coefficients in the cost function are designed to be adaptive to guarantee that the towing system functions well under environmental disturbances and to enhance the efficiency of the towing system. The tug controller provides the forces to tow the ship and tracks the reference trajectory that is computed online based on the towing angles calculated by the supervisory controller. Simulation results show that the proposed algorithm can make two autonomous tugs cooperatively tow a ship to a desired position with a desired heading and velocity under (even harsh) environmental disturbances.
This article investigates the problem of robust adaptive leaderless consensus for heterogeneous uncertain non-minimum-phase linear multi-agent systems over directed communication graphs. Each agent is assumed to be of unknown nominal dynamics and also subject to external disturbances and/or unmodeled dynamics. A novel distributed robust adaptive control strategy is proposed. It is shown that the robust adaptive leaderless consensus problem is solved with the proposed control strategy under some sufficient conditions. Two examples are provided to demonstrate the efficacy of the proposed control strategy.
In this paper, a new distributed consensus tracking protocol incorporating local disturbance rejection is devised for a multi-agent system with heterogeneous dynamic uncertainties and disturbances over a directed graph. It is of a two-degree-of-freedom nature. Specifically, a robust distributed controller is designed for consensus tracking, while a local disturbance estimator is designed for each agent without requiring the input channel information of disturbances. The condition for asymptotic disturbance rejection is derived. Moreover, even when the disturbance model is not exactly known, the developed method also provides good disturbance-rejection performance. Then, a robust stabilization condition with less conservativeness is derived for the whole multi-agent system. Further, a design algorithm is given. Finally, comparisons with the conventional one-degree-of-freedom-based distributed disturbance-rejection method for mismatched disturbances and the distributed extended-state observer for matched disturbances validate the developed method.
To address the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine the communication information between UAVs. Finally, the simulation results of the algorithm in two sets of scenarios are given, and the results show that the method is effective and applicable.
This paper examines the performance of Full-Duplex Cooperative Rate Splitting (FD-CRS) with Simultaneous Wireless Information and Power Transfer (SWIPT) support in Multiple Input Single Output (MISO) networks. In a Rate Splitting Multiple Access (RSMA) multicast system with two local users and one remote user, the common data stream contains the needs of all users, and all users can decode the common data stream. Therefore, each user can receive some information that other users need, and local users with better channel conditions can use this information to further enhance the reception reliability and data rate of users with poor channel quality. Moreover, using Cell-Center Users (CCUs) as cooperative relays to assist the transmission of common data can improve the average system rate. To maximize the minimum achievable rate, the beamforming vector of the Base Station (BS), the common stream splitting vector, the cooperative distributed beam vector, and the strong user transmission power are optimized under the power budget constraints of the BS and relay devices and the quality-of-service constraints of the users. Since the overall problem is not convex, it cannot be solved directly. Therefore, a low-complexity algorithm based on the Successive Convex Approximation (SCA) technique is proposed to find the optimal solution to the problem under consideration. The simulation results show that FD C-RSMA achieves better gain and is more powerful than FD C-NOMA, HD C-RSMA, RSMA, and NOMA.
This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of the multi-agent network under privacy protection, which means that the local objective of each agent is unknown to others. The above problem involves complexity simultaneously in the time and space aspects. Yet existing works about distributed optimization mainly consider privacy protection in the space aspect where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered in this paper, the decision variable is a continuous function concerning time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal decision derivative function rather than the decision function. This manner can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework.
Taking into account the new characteristics of global cooperation in supply chains, this paper studies a hybrid model of the cooperative negotiation process for order distribution in a supply chain. After reviewing and analyzing the main domestic and international progress in cooperative negotiation modeling for supply chains, several problems are pointed out. For example, traditional simple multi-agent system (MAS) frameworks have limitations and are not suitable for modeling complex systems. To solve these problems, drawing on the multi-agent structure and complex system modeling, the manufacturing supply chain is taken as an example, and a time Petri net production model is adopted to decompose the materials. A cooperative negotiation model for order distribution in the supply chain is then constructed by combining multi-agent techniques with time Petri net modeling. The simulation results reveal that the model helps solve the problems of cooperative negotiation in supply chains.
This paper studies the connectivity-maintaining consensus of multi-agent systems. Considering the impact of the agents' sensing ranges on connectivity and communication energy consumption, a novel communication management strategy is proposed for multi-agent systems so that the connectivity of the system can be maintained and communication energy can be saved. In this paper, communication management means a strategy for adjusting the sensing ranges of agents in the process of reaching consensus. The proposed communication management is not coupled with the controller but only imposes a constraint on it, so there is more freedom to develop an appropriate control strategy for achieving consensus. For multi-agent systems with this communication management, a predictive-control-based strategy is developed for achieving consensus. Simulation results indicate the effectiveness and advantages of the scheme.
Funding (RegionSTLight traffic signal control article): supported by the National Science and Technology Major Project (2021ZD0112702), the National Natural Science Foundation (NNSF) of China (62373100, 62233003), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
Funding (event-triggered robust cooperative output regulation paper): the National Natural Science Foundation of China (NSFC) Excellent Young Scientists Fund (Hong Kong and Macao) under Grant 62222318.
Funding (finite-time prescribed performance formation control paper): the National Natural Science Foundation of China (62203356) and the Fundamental Research Funds for the Central Universities of China (31020210502002).
Funding (PI+R economic dispatch paper for BESSs): supported by the National Natural Science Foundation of China (62103203) and the General Terminal IC Interdisciplinary Science Center of Nankai University.
文摘Battery energy storage systems(BESSs)are widely used in smart grids.However,power consumed by inner impedance and the capacity degradation of each battery unit become particularly severe,which has resulted in an increase in operating costs.The general economic dispatch(ED)algorithm based on marginal cost(MC)consensus is usually a proportional(P)controller,which encounters the defects of slow convergence speed and low control accuracy.In order to solve the distributed ED problem of the isolated BESS network with excellent dynamic and steady-state performance,we attempt to design a proportional integral(PI)controller with a reset mechanism(PI+R)to asymptotically promote MC consensus and total power mismatch towards 0 in this paper.To be frank,the integral term in the PI controller is reset to 0 at an appropriate time when the proportional term undergoes a zero crossing,which accelerates convergence,improves control accuracy,and avoids overshoot.The eigenvalues of the system under a PI+R controller is well analyzed,ensuring the regularity of the system and enabling the reset mechanism.To ensure supply and demand balance within the isolated BESSs,a centralized reset mechanism is introduced,so that the controller is distributed in a flow set and centralized in a jump set.To cope with Zeno behavior and input delay,a dwell time that the system resides in a flow set is given.Based on this,the system with input delays can be reduced to a time-delay free system.Considering the capacity limitation of the battery,a modified MC scheme with PI+R controller is designed.The correctness of the designed scheme is verified through relevant simulations.
Abstract: As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents. However, the existing communication schemes can bring much timing redundancy and many irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the final action choice of the agent is ensured. Our evaluation using a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve the learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
Funding: "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) (2021RIS-002).
Abstract: This paper examines the difficulties of managing distributed power systems, notably due to the increasing use of renewable energy sources, and focuses on voltage control challenges exacerbated by their variable nature in modern power grids. To tackle the unique challenges of voltage control in distributed renewable energy networks, researchers are increasingly turning towards multi-agent reinforcement learning (MARL). However, MARL raises safety concerns due to the unpredictability of agent actions during their exploration phase. This unpredictability can lead to unsafe control measures. To mitigate these safety concerns in MARL-based voltage control, our study introduces a novel approach: Safety-Constrained Multi-Agent Reinforcement Learning (SC-MARL). This approach incorporates a specialized safety constraint module specifically designed for voltage control within the MARL framework. This module ensures that the MARL agents carry out voltage control actions safely. The experiments demonstrate that, in the 33-bus, 141-bus, and 322-bus power systems, employing SC-MARL for voltage control reduced the Voltage Out of Control Rate (%V.out) from 0.43, 0.24, and 2.95 to 0, 0.01, and 0.03, respectively. Additionally, the Reactive Power Loss (Q loss) decreased from 0.095, 0.547, and 0.017 to 0.062, 0.452, and 0.016 in the corresponding systems.
Funding: Supported in part by the National Natural Science Foundation of China (62373065, 61873304, 62173048, 62106023), the Innovation and Entrepreneurship Talent Funding Project of Jilin Province (2022QN04), the Changchun Science and Technology Project (21ZY41), and the Open Research Fund of National Mobile Communications Research Laboratory, Southeast University (2024D09).
Abstract: This paper presents a distributed scheme with limited communications, aiming to achieve cooperative motion control for multiple omnidirectional mobile manipulators (MOMMs). The proposed scheme extends the existing single-agent motion control to cater to scenarios involving the cooperative operation of MOMMs. Specifically, squeeze-free cooperative load transportation is achieved for the end-effectors of MOMMs by incorporating cooperative repetitive motion planning (CRMP), while guiding each individual to desired poses. Then, the distributed scheme is formulated as a time-varying quadratic programming (QP) problem and solved online utilizing a noise-tolerant zeroing neural network (NTZNN). Theoretical analysis shows that the NTZNN model converges globally to the optimal solution of the QP in the presence of noise. Finally, the effectiveness of the control design is demonstrated by numerical simulations and physical platform experiments.
Abstract: In this article, lane change models for mixed traffic flow under cooperative adaptive cruise control (CACC) platoon formation are established. The analysis begins by examining the impact of lane changes on traffic flow stability. The influences of various factors such as lane change locations, timing, and the current traffic state on stability are discussed. In this analysis, it is assumed that the lane change location and the entry position in the adjacent lane have already been selected, without considering the specific intention behind the lane change. The speeds of the involved vehicles are adjusted based on an existing lane change model, and various conditions are analyzed for traffic flow disturbances, including duration, shock amplitude, and driving delays. Numerical calculations are provided to illustrate these effects. Additionally, traffic flow stability is factored into the lane change decision-making process. By incorporating disturbances to the fleet into the lane change income model, both a lane change intention model and a lane change execution model are constructed. These models are then compared with a model that does not account for stability, leading to the corresponding conclusions.
Abstract: In existing formation models, vehicles in the same lane or adjacent lanes are regarded as the structure, the driving behavior of vehicles is studied from the perspectives of safety, speed consistency, and stability, and a speed control model is proposed from the perspective of the vehicles themselves to obtain a stable fleet with the same distance and speed. However, in this process, the initial condition of the vehicle, the traffic flow environment, and the efficiency of the fleet formation are less considered. Therefore, based on a summary of existing fleet building models, this paper puts forward a rapid construction model and algorithm for a cooperative adaptive cruise control platoon fleet. One of the important goals of forming a team is to enter the team with the smoothest trajectory in the shortest time. Therefore, this chapter studies the trajectory optimization of the vehicle formation process from the perspective of vehicle dynamics.
Funding: Supported by the National Natural Science Foundation of China (91016017) and the National Aviation Fund of China (20115868009).
Abstract: The cooperative control and stability analysis problems for the multi-agent system with sampled communication are investigated. Distributed state feedback controllers are adopted for the cooperation of networked agents. A theorem in the form of linear matrix inequalities (LMIs) is derived to analyze the system stability. Another theorem, in the form of an optimization problem subject to LMI constraints, is proposed to design the controller, and then the algorithm is presented. The simulation results verify the validity and the effectiveness of the proposed approach.
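To make the idea of distributed state feedback under sampled communication concrete, here is a minimal sketch under strong assumptions that are not taken from the paper: single-integrator agents, an undirected ring graph, a zero-order hold between sampling instants, and a scalar feedback gain. It does not reproduce the LMI-based analysis or design; the function name `sampled_consensus` and all numeric values are hypothetical.

```python
def sampled_consensus(states, neighbors, gain, period, steps):
    """Simulate agents x_i' = u_i whose inputs use only sampled neighbor states.

    At each sampling instant every agent computes
    u_i = gain * sum_j (x_j - x_i) over its neighbors and holds it for one period.
    """
    x = list(states)
    for _ in range(steps):
        sampled = list(x)  # values exchanged at this sampling instant
        for i, nbrs in enumerate(neighbors):
            u = gain * sum(sampled[j] - sampled[i] for j in nbrs)
            x[i] = sampled[i] + period * u  # exact for an integrator with held input
    return x


if __name__ == "__main__":
    ring = [[1, 3], [0, 2], [1, 3], [0, 2]]  # hypothetical 4-agent ring
    print(sampled_consensus([1.0, 2.0, 3.0, 4.0], ring, gain=1.0, period=0.2, steps=100))
```

In this simplified setting the update is x(k+1) = (I - gain * period * L) x(k), where L is the graph Laplacian, so the agents agree only if gain * period times the largest eigenvalue of L stays below 2. A sampling period that is too long destabilizes the loop, which is the kind of trade-off a stability condition for sampled communication has to capture.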
Funding: Financial support from the National Natural Science Foundation of China (Grant No. 61601491), the Natural Science Foundation of Hubei Province, China (Grant No. 2018CFC865), and the Military Research Project of China (Grant No. YJ2020B117).
Abstract: To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function, and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamic change in the dimension of the observational features input by partially observable systems, a feature embedding block is proposed. By combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed from the aspects of algorithm performance, migration effect in task scenarios, and self-organization capability after being damaged, and the potential deployment and application of DPOMH-PPO in the real environment is verified.
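The feature embedding idea in the abstract above, where a varying number of observed entities is compressed into a fixed-size encoding through column-wise max pooling (CMP) and column-wise average pooling (CAP), can be sketched as follows. This is an illustrative assumption about how the two poolings might be combined (plain concatenation); the abstract does not specify the combination, and the function name `embed_observations` is hypothetical.

```python
def embed_observations(rows):
    """Compress a variable number of per-entity feature rows into one fixed-size vector.

    rows: non-empty list of equal-length feature vectors, one per observed entity.
    Returns CMP features followed by CAP features (concatenation is an assumption).
    """
    if not rows:
        raise ValueError("at least one observed entity is required")
    columns = list(zip(*rows))                   # transpose: one tuple per feature column
    cmp_part = [max(col) for col in columns]             # column-wise max pooling
    cap_part = [sum(col) / len(col) for col in columns]  # column-wise average pooling
    return cmp_part + cap_part


if __name__ == "__main__":
    two_entities = [[1.0, 4.0], [3.0, 2.0]]
    three_entities = [[1.0, 4.0], [3.0, 2.0], [0.0, 0.0]]
    # Both encodings have length 4 even though the entity counts differ.
    print(embed_observations(two_entities))
    print(embed_observations(three_entities))
```

Because both poolings reduce over the entity axis, the output length depends only on the number of features per entity, so a policy network with a fixed input size can be fed observations containing any number of visible USVs or targets.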
Funding: This work was supported by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah (G-363-135-1438).
Abstract: This paper addresses the cooperative control problem of multiple unmanned aerial vehicle (multi-UAV) systems. First, a new distributed consensus algorithm for second-order nonlinear multi-agent systems (MAS) is formulated under the leader-following approach. The algorithm provides smooth input signals to the agents' control channels, which avoids the chattering effect generated by conventional sliding mode-based control protocols. Second, a new formation control scheme is developed by integrating smooth distributed consensus control protocols into the geometric pattern model to achieve three-dimensional formation tracking. Lyapunov theory is used to prove the stability and convergence of both the distributed consensus and formation controllers. The effectiveness of the proposed algorithms is demonstrated through simulation results.
Funding: Supported by the China Scholarship Council (201806950080), the Researchlab Autonomous Shipping (RAS) of Delft University of Technology, and the INTERREG North Sea Region Grant "AVATAR" funded by the European Regional Development Fund.
Abstract: Among the promising applications of autonomous surface vessels (ASVs) is the utilization of multiple autonomous tugs for manipulating a floating object such as an oil platform, a broken ship, or a ship in port areas. Considering the real conditions and operations of maritime practice, this paper proposes a multi-agent control algorithm to manipulate a ship to a desired position with a desired heading and velocity under environmental disturbances. The control architecture consists of a supervisory controller in the higher layer and tug controllers in the lower layer. The supervisory controller allocates the towing forces and angles between the tugs and the ship by minimizing the error in the position and velocity of the ship. The weight coefficients in the cost function are designed to be adaptive to guarantee that the towing system functions well under environmental disturbances, and to enhance the efficiency of the towing system. The tug controller provides the forces to tow the ship and tracks the reference trajectory that is computed online based on the towing angles calculated by the supervisory controller. Simulation results show that the proposed algorithm can make the two autonomous tugs cooperatively tow a ship to a desired position with a desired heading and velocity under (even harsh) environmental disturbances.
Funding: Research Grants Council of Hong Kong under Grant CityU-11205221.
Abstract: This article investigates the problem of robust adaptive leaderless consensus for heterogeneous uncertain non-minimum-phase linear multi-agent systems over directed communication graphs. Each agent is assumed to be of unknown nominal dynamics and also subject to external disturbances and/or unmodeled dynamics. A novel distributed robust adaptive control strategy is proposed. It is shown that the robust adaptive leaderless consensus problem is solved with the proposed control strategy under some sufficient conditions. Two examples are provided to demonstrate the efficacy of the proposed control strategy.
Funding: Supported by the National Natural Science Foundation of China (62003010, 61873006, 61673053), the Beijing Postdoctoral Research Foundation (Q6041001202001), the Postdoctoral Research Foundation of Chaoyang District (Q1041001202101), and the National Key Research and Development Project (2018YFC1602704, 2018YFB1702704).
Abstract: In this paper, a new distributed consensus tracking protocol incorporating local disturbance rejection is devised for a multi-agent system with heterogeneous dynamic uncertainties and disturbances over a directed graph. It is of a two-degree-of-freedom nature. Specifically, a robust distributed controller is designed for consensus tracking, while a local disturbance estimator is designed for each agent without requiring the input channel information of disturbances. The condition for asymptotic disturbance rejection is derived. Moreover, even when the disturbance model is not exactly known, the developed method also provides good disturbance-rejection performance. Then, a robust stabilization condition with less conservativeness is derived for the whole multi-agent system. Further, a design algorithm is given. Finally, comparisons with the conventional one-degree-of-freedom-based distributed disturbance-rejection method for mismatched disturbances and the distributed extended-state observer for matched disturbances validate the developed method.
Funding: Supported in part by the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation of China (Grant No. 20220001068001), the National Natural Science Foundation of China (Grant No. 61673327), the Natural Science Basic Research Plan in Shaanxi Province, China (Grant No. 2023-JC-QN-0733), and the China Industry-University-Research Innovation Foundation (Grant No. 2022IT188).
Abstract: Aiming at the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine communication information between UAVs. Finally, the simulation results of the algorithm in two sets of scenarios are given, and the results show that the method has good effectiveness and applicability.
Funding: This work is supported by the Special Fund Project for Technology Innovation of Xuzhou City in 2022 (KC22083), the Jiangsu Province Key Research and Development (Modern Agriculture) Project (BE2019333 and BE2019334), the Guangzhou Basic Research Program Municipal School (College) Joint Funding Project under Grant 2023A03J0111, and the Innovation Project of Jiangsu Province (SJCK21_1133).
Abstract: This paper examines the performance of Full-Duplex Cooperative Rate Splitting (FD-CRS) with Simultaneous Wireless Information and Power Transfer (SWIPT) support in Multiple Input Single Output (MISO) networks. In a Rate Splitting Multiple Access (RSMA) multicast system with two local users and one remote user, the common data stream contains the needs of all users, and all users can decode the common data stream. Therefore, each user can receive some information that other users need, and local users with better channel conditions can use this information to further enhance the reception reliability and data rate of users with poor channel quality. Even using Cell-Center-Users (CCUs) as a cooperative relay to assist the transmission of common data can improve the average system speed. To maximize the minimum achievable rate, we optimize the beamforming vector of the Base Station (BS), the common stream splitting vector, the cooperative distributed beam vector, and the strong user transmission power under the power budget constraints of the BS and relay devices and the service quality requirement constraints of users. Since the whole problem is not convex, we cannot solve it directly. Therefore, we propose a low-complexity algorithm based on Successive Convex Approximation (SCA) technology to find the optimal solution to the problem under consideration. The simulation results show that FD C-RSMA has better gain and is more powerful than FD C-NOMA, HD C-RSMA, RSMA, and NOMA.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (61773260) and the Ministry of Science and Technology (2018YFB130590).
Abstract: This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of the multi-agent network under privacy protection, which means that the local objective of each agent is unknown to others. The above problem involves complexity simultaneously in the time and space aspects. Yet existing works about distributed optimization mainly consider privacy protection in the space aspect, where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered in this paper, the decision variable is a continuous function of time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal decision derivative function rather than the decision function. This manner can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework.
Funding: The National Natural Science Foundation of China (No. 70401013) and the National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2006BAH02A06).
Abstract: Taking into account the new characteristics of global cooperation in supply chains, this paper studies a hybrid model of the cooperative negotiation process for order distribution in a supply chain. After reviewing and analyzing the main domestic and overseas progress in cooperative negotiation modeling in supply chains, some problems are pointed out. For example, traditional simple multi-agent system (MAS) frameworks have limitations and are not suitable for modeling complex systems. To solve these problems, with the aid of the multi-agent structure and complex system modeling, the manufacturing supply chain is taken as an example, and a time Petri net production model is adopted to decompose the materials. Then a cooperative negotiation model for order distribution in the supply chain is constructed by combining multi-agent techniques with time Petri net modeling. The simulation results reveal that the model helps solve the problems of cooperative negotiation in supply chains.
Funding: Supported by the National Key Research and Development Program of China (2018AAA0101701) and the National Natural Science Foundation of China (62173224, 61833012).
Abstract: This paper studies the connectivity-maintaining consensus of multi-agent systems. Considering the impact of the sensing ranges of agents on connectivity and communication energy consumption, a novel communication management strategy is proposed for multi-agent systems so that the connectivity of the system can be maintained and communication energy can be saved. In this paper, communication management means a strategy for how the sensing ranges of agents are adjusted in the process of reaching consensus. The proposed communication management is not coupled with the controller but only imposes a constraint on it, so there is more freedom to develop an appropriate control strategy for achieving consensus. For multi-agent systems with this novel communication management, a predictive control based strategy is developed for achieving consensus. Simulation results indicate the effectiveness and advantages of our scheme.