Funding: National Natural Science Foundation of China, Grant/Award Number: 61872171; The Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Grant/Award Number: 2021490811.
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can provide outdated experiences that hinder convergence, resulting in poor training performance for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed. It generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning process by engaging in actions that increase the mutual information between their actions and the environment state. In addition, MioDSC can generate cooperative policies using the options framework, allowing agents to learn and reuse complex action sequences and accelerating the convergence of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and is robust across various multi-agent system tasks with high stability.
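To make the idea of a mutual-information intrinsic reward concrete, the following is a minimal sketch, not the MioDSC implementation. It assumes discrete actions and states and uses a simple plug-in (counting) estimate of I(A; S), whereas the abstract concerns continuous spaces and does not say which estimator is used. The function names and the weight `beta` are hypothetical.

```python
import math
from collections import Counter


def empirical_mutual_information(actions, states):
    """Plug-in estimate of I(A; S) in nats from paired discrete samples.

    Assumption: a counting estimator over discrete values; the abstract
    does not specify the estimator used by the authors.
    """
    if len(actions) != len(states) or not actions:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actions)
    joint = Counter(zip(actions, states))
    count_a = Counter(actions)
    count_s = Counter(states)
    mi = 0.0
    for (a, s), c in joint.items():
        # p(a,s) * log(p(a,s) / (p(a) * p(s))), written in terms of counts
        mi += (c / n) * math.log(c * n / (count_a[a] * count_s[s]))
    return mi


def shaped_reward(extrinsic, actions, states, beta=0.1):
    """Extrinsic reward plus a weighted intrinsic bonus (beta is hypothetical)."""
    return extrinsic + beta * empirical_mutual_information(actions, states)
```

Under this sketch, a batch in which actions fully determine the visited states earns the largest bonus, while a batch in which actions and states are independent earns none.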
Funding: Co-supported by the Equipment Advance Research Project, China (No. 50912020401) and the Chinese Government Scholarship (No. 201906830037).
Abstract: This paper presents a discrete-time attitude control strategy with equi-global practical stabilizability for aligning the attitude of multiple spacecraft to a predesigned configuration according to a time-variant reference. By utilizing the interference of the wireless channel, the communication scheme designed in this paper can save communication resources, computation, and energy in proportion to the number of spacecraft. The exact and approximate discrete-time models of the consensus-based spacecraft tracking system are given. A framework is then developed for designing an event-triggered control scheme for the exact discrete-time system via its approximate models, which avoids periodic actuation, and Zeno behavior is proved to be excluded. Furthermore, the control scheme can handle the presence of an unknown fading channel. Finally, simulation results are presented to demonstrate the effectiveness of the control strategy.
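The notion of event-triggered actuation with practical (rather than asymptotic) stability can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the spacecraft scheme from the paper: it uses a scalar unstable plant, a static trigger rule of the common form |x - x_held| > sigma * |x| + eps, and hypothetical parameter values.

```python
def simulate_event_triggered(a=1.1, b=1.0, gain=0.6, sigma=0.3, eps=1e-3,
                             x0=1.0, steps=200):
    """Simulate x[k+1] = a*x[k] + b*u[k] with u = -gain * x_held.

    The held state x_held is refreshed only when the trigger condition
    fires, so the actuator is not updated at every step.
    All parameters are illustrative assumptions.
    """
    x = x0
    x_held = None   # state sample last sent to the actuator
    updates = 0     # number of actuation updates (events)
    for _ in range(steps):
        if x_held is None or abs(x - x_held) > sigma * abs(x) + eps:
            x_held = x
            updates += 1
        u = -gain * x_held
        x = a * x + b * u
    return x, updates


final_state, n_updates = simulate_event_triggered()
print(f"final |x| = {abs(final_state):.6f}, updates = {n_updates} of 200 steps")
```

With these values the state satisfies |x[k+1]| <= 0.68 |x[k]| + 0.0006 at every step, so it settles into a small neighbourhood of the origin rather than converging exactly. The absolute term `eps` is what keeps the trigger from firing indefinitely as the state shrinks.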
Funding: Supported by the National Natural Science Foundation of China (Nos. 61790573, 61890924, 61991404) and the Liao Ning Revitalization Talents Program (No. XLYC1907087).
Abstract: Distributed state estimation is of paramount importance in many applications involving large-scale complex systems monitored by spatially deployed networked sensors. This paper provides an overview of the analysis of distributed state estimation algorithms for linear time-invariant systems. A number of previous works are reviewed and a clear classification of the main approaches in this field is presented, i.e., Kalman-filter-type methods and Luenberger-observer-type methods. The design and stability analysis of these methods are discussed. Moreover, a comprehensive comparison of the existing results is provided in terms of standard metrics, including graph connectivity, system observability, optimality, and time scale. Finally, several important and challenging future research directions are discussed.
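As a rough illustration of the Luenberger-observer-type family mentioned above, the sketch below runs a two-node distributed observer on a constant scalar state. It is a toy under stated assumptions, not an algorithm taken from the survey: only node 1 measures the state (noise-free), node 2 has no measurement and relies on a consensus term, and the gains are hypothetical.

```python
def run_distributed_observer(x_true=5.0, gain=0.5, coupling=0.3, steps=100):
    """Two-node consensus-based observer for a constant scalar state.

    Node 0 corrects its estimate with its own measurement and with its
    neighbour's estimate; node 1 has no sensor and uses only the
    neighbour term. Gains are illustrative assumptions.
    """
    est = [0.0, 0.0]
    for _ in range(steps):
        y0 = x_true  # noise-free measurement available at node 0 only
        new0 = est[0] + gain * (y0 - est[0]) + coupling * (est[1] - est[0])
        new1 = est[1] + coupling * (est[0] - est[1])
        est = [new0, new1]  # synchronous update of both nodes
    return est


estimates = run_distributed_observer()
print(estimates)
```

The estimation errors evolve as e0' = 0.2 e0 + 0.3 e1 and e1' = 0.3 e0 + 0.7 e1, whose eigenvalues lie inside the unit circle, so both nodes converge even though node 2 alone cannot observe the state. This is the sense in which graph connectivity and collective observability matter in the comparison.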
Funding: Supported in part by the National Natural Science Foundation of China (62121004, 62033003, 61973091, 62203119), the Local Innovative and Research Teams Project of Guangdong Special Support Program (2019BT02X353), the Natural Science Foundation of Guangdong Province (2023A1515011527, 2022A1515011506), and the China National Postdoctoral Program (BX20220095, 2022M710826).
Abstract: This article investigates the dynamic event-triggered (DET) formation control problem for a class of stochastic nonlinear multi-agent systems (MASs) with full state constraints. The human operator sends commands to the leader as control input signals, and all followers keep formation by communicating over the network topology. Under the command-filter-based backstepping technique, radial basis function neural networks (RBF NNs) and the barrier Lyapunov function (BLF) are used to handle the unknown nonlinear terms and the full state constraints, respectively. Furthermore, a DET control mechanism is proposed to reduce the occupation of communication bandwidth. The presented distributed formation control strategy guarantees that all signals of the MASs are semi-globally uniformly ultimately bounded (SGUUB) in probability. Finally, the feasibility of the theoretical result is demonstrated by a simulation example.
Funding: National Natural Science Foundation of China (90605007).
Abstract: To investigate the control of morphing wings by means of interacting effectors, this article proposes a distributed coordinated control scheme with sampled communication, based on a simple morphing wing model built from arrayed agents. The control scheme can change the shape of the airfoil into a desired one and keep it smooth during morphing. Because the interconnection of the communication network and the agents complicates the behavior of the morphing wing system, a diagrammatic stability analysis method is put forward to ensure system stability. Two simulations of the morphing wing system are carried out in MATLAB. The results confirm the feasibility of the distributed coordinated control scheme and the effectiveness of the diagrammatic stability analysis method.
Funding: Supported by the National Natural Science Foundation of China (No. 61039001) and the National Science and Technology Support Program of China (No. 2011BAH24B10).
Abstract: This study analyzes the cooperative coalition problem for formation scheduling based on incomplete information. A multi-agent cooperative coalition framework is developed to optimize the formation scheduling problem in a decentralized manner. The social class differentiation mechanism and role-assuming mechanism are incorporated into the framework, which, in turn, ensures that the multi-agent system (MAS) evolves in the optimal direction. Moreover, a further differentiation pressure can be achieved to help the MAS escape from local optima. A Bayesian coalition negotiation algorithm is constructed, within which the Harsanyi transformation is introduced to transform the coalition problem based on incomplete information into the Bayesian-equivalent coalition problem based on imperfect information. The simulation results suggest that the distribution of agents' expectations of other agents' unknown information approximates the true distribution after a finite number of generations. The comparisons indicate that the MAS cooperative coalition algorithm produces a significantly better utility and escapes from local optima more effectively than the proposal-engaged marriage algorithm and the Simulated Annealing algorithm.
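Under the Harsanyi transformation, each agent treats another agent's unknown information as a type drawn from a prior and revises that belief as it observes behaviour. The sketch below shows only this generic Bayesian belief update, not the paper's coalition negotiation algorithm. The type names, observations, and likelihood values are hypothetical.

```python
def bayes_update(prior, likelihood, observation):
    """Return the posterior belief over another agent's type.

    prior:      dict mapping type -> probability
    likelihood: dict mapping type -> {observation: probability}
    Both are illustrative assumptions, not values from the paper.
    """
    unnormalised = {
        t: prior[t] * likelihood[t].get(observation, 0.0) for t in prior
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every type")
    return {t: w / total for t, w in unnormalised.items()}


# Hypothetical two-type example: how likely each type is to accept a proposal.
belief = {"high_value": 0.5, "low_value": 0.5}
likelihood = {
    "high_value": {"accept": 0.8, "reject": 0.2},
    "low_value": {"accept": 0.2, "reject": 0.8},
}
for observed in ["accept", "accept"]:
    belief = bayes_update(belief, likelihood, observed)
print(belief)
```

Repeating the update over successive negotiation rounds concentrates the belief on the type that best explains the observed responses, which mirrors the abstract's claim that agents' expectations approach the true distribution over generations.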