This article studies effective traffic signal control for multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed, which can equivalently decompose the global Q value of the traffic system into the local values of several regions. Based on this framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. To achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. Experiments in synthetic, real-world, and city-level scenarios show that the proposed RegionSTLight converges more quickly, is more stable, and obtains better asymptotic performance than state-of-the-art models.
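The decomposition idea in the abstract above can be illustrated with a toy sketch. It assumes the global Q value is simply the sum of regional values and that regions do not overlap; the region layout, the phase set, and the `regional_q` scoring rule are hypothetical and are not taken from RegionSTLight.

```python
from itertools import product

# Hypothetical toy network: 4 intersections split into 2 non-overlapping regions.
REGIONS = {"A": (0, 1), "B": (2, 3)}
PHASES = (0, 1)  # two signal phases per intersection (assumption)


def regional_q(region, local_action):
    """Made-up local value: reward matching phases, with a bonus in region A."""
    a, b = local_action
    bonus = 0.5 if region == "A" and a == 1 else 0.0
    return (1.0 if a == b else 0.0) + bonus


def global_q(joint_action):
    """Global value, assumed here to be the sum of the regional values."""
    return sum(
        regional_q(region, tuple(joint_action[i] for i in members))
        for region, members in REGIONS.items()
    )


def greedy_by_region():
    """Maximize each regional value independently and stitch the results together."""
    joint = [None] * 4
    for region, members in REGIONS.items():
        best = max(
            product(PHASES, repeat=len(members)),
            key=lambda local: regional_q(region, local),
        )
        for index, phase in zip(members, best):
            joint[index] = phase
    return tuple(joint)


def brute_force():
    """Search all joint actions; feasible only for toy sizes."""
    return max(product(PHASES, repeat=4), key=global_q)
```

Under the additive assumption, the region-wise greedy choice reaches the same global value as the exhaustive search while evaluating far fewer joint actions, which is the practical appeal of such a decomposition.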
The emergence of beyond-5G networks has the potential to deliver seamless and intelligent connectivity on a global scale. Network slicing is crucial for delivering services to different, demanding vertical applications in this context. Next-generation applications have time-sensitive requirements and depend on the most efficient routing path to ensure packets reach their intended destinations. However, the existing Internet Protocol (IP) over a multi-domain network faces challenges in enforcing network slicing due to minimal collaboration and information sharing among network operators. Conventional inter-domain routing methods, like Border Gateway Protocol (BGP), cannot make routing decisions based on performance, which frequently results in traffic flowing across congested paths that are never optimal. To address these issues, we propose CoopAI-Route, a multi-agent cooperative deep reinforcement learning (DRL) system utilizing hierarchical software-defined networks (SDN). This framework enforces network slicing in multi-domain networks and enables cooperative communication among administrators to find performance-based routes both within and across domains. CoopAI-Route employs the Distributed Global Topology (DGT) algorithm to define inter-domain Quality of Service (QoS) paths. CoopAI-Route uses a DRL agent with a message-passing multi-agent Twin-Delayed Deep Deterministic Policy Gradient method to ensure optimal end-to-end routes adapted to the specific requirements of network slicing applications. Our evaluation demonstrates CoopAI-Route’s commendable performance in scalability, link failure handling, and adaptability to evolving topologies compared to state-of-the-art methods.
Aim: To design and implement a multi-agent cooperative problem-solving expert system tool. Methods: A blackboard system was adopted as the data sharing and information exchange center to coordinate complex cooperative problem solving. The system was developed in a mixed UNIX and MS Windows 95 TCP/IP network environment. Results and Conclusion: A prototype of a multi-agent cooperative expert system tool was implemented. The experiment demonstrates that the fundamental functions of a cooperative expert system are realized.
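The blackboard pattern mentioned in the abstract above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the tool the authors built: the blackboard is a plain dictionary, and the two agents (`parser_agent`, `counter_agent`) and the keys `raw`, `tokens`, and `count` are hypothetical.

```python
def parser_agent(blackboard):
    """Contributes tokens once raw text is available; returns True if it wrote."""
    if "raw" in blackboard and "tokens" not in blackboard:
        blackboard["tokens"] = blackboard["raw"].split()
        return True
    return False


def counter_agent(blackboard):
    """Contributes a token count once tokens are available."""
    if "tokens" in blackboard and "count" not in blackboard:
        blackboard["count"] = len(blackboard["tokens"])
        return True
    return False


def run(blackboard, agents):
    """Control loop: keep polling agents until none can contribute anything new."""
    progress = True
    while progress:
        # any() stops at the first agent that writes, then the round restarts.
        progress = any(agent(blackboard) for agent in agents)
    return blackboard


result = run({"raw": "share data through one board"}, [counter_agent, parser_agent])
```

The agents never call each other; they cooperate only through the shared data, so the order in which they are registered does not affect the final result.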
This paper investigates the robust cooperative output regulation problem for a class of heterogeneous uncertain linear multi-agent systems with an unknown exosystem via event-triggered control (ETC). By utilizing the internal model approach and the adaptive control technique, a distributed adaptive internal model is constructed for each agent. Then, based on this internal model, a fully distributed ETC strategy composed of a distributed event-triggered adaptive output feedback control law and a distributed dynamic event-triggering mechanism is proposed, in which each agent updates its control input at its own triggering time instants. It is shown that under the proposed ETC strategy, the robust cooperative output regulation problem can be solved without requiring either the global information associated with the communication topology or the bounds of the uncertain or unknown parameters in each agent and the exosystem. A numerical example is provided to illustrate the effectiveness of the proposed control strategy.
The cooperative control and stability analysis problems for multi-agent systems with sampled communication are investigated. Distributed state feedback controllers are adopted for the cooperation of networked agents. A theorem in the form of linear matrix inequalities (LMIs) is derived to analyze the system stability. Another theorem, in the form of an optimization problem subject to LMI constraints, is proposed to design the controller, and the corresponding algorithm is then presented. The simulation results verify the validity and effectiveness of the proposed approach.
Taking into account the new characteristics of global cooperation in supply chains, this paper studies a hybrid model of the cooperative negotiation process for order distribution in a supply chain. After reviewing and analyzing the main domestic and international work on cooperative negotiation modeling in supply chains, several problems are pointed out. For example, traditional simple multi-agent system (MAS) frameworks have limitations and are not suitable for modeling complex systems. To solve these problems, drawing on the multi-agent structure and complex system modeling, the manufacturing supply chain is taken as an example, and a time Petri net production model is adopted to decompose the materials. A cooperative negotiation model for order distribution in the supply chain is then constructed by combining multi-agent techniques with time Petri net modeling. The simulation results reveal that this model helps solve the problems of cooperative negotiation in supply chains.
To address the problem of cooperative air defense for a surface warship formation, this paper maps the cooperative air-defense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS), based on the similarity of the defense mechanisms and characteristics of the CASoSSWF and the BIS. It then designs the component models and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent, and a communicating agent, using the theories and methods of the artificial immune system, the multi-agent system (MAS), the vaccine, and the danger theory (DT). Moreover, a new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is put forward. The immune response and immune mechanism of the CASoSSWF are analyzed. The model has the capabilities of memory, evolution, good adaptability to dynamic environments, and self-learning, and it adequately embodies the cooperative air-defense mechanism of the CASoSSWF. It therefore offers a novel idea for the CASoSSWF and can provide conceptual models for a surface warship formation operation simulation system.
This paper addresses the cooperative control problem of multiple unmanned aerial vehicle (multi-UAV) systems. First, a new distributed consensus algorithm for second-order nonlinear multi-agent systems (MAS) is formulated under the leader-following approach. The algorithm provides smooth input signals to the agents’ control channels, which avoids the chattering effect generated by conventional sliding mode-based control protocols. Second, a new formation control scheme is developed by integrating smooth distributed consensus control protocols into the geometric pattern model to achieve three-dimensional formation tracking. Lyapunov theory is used to prove the stability and convergence of both the distributed consensus and formation controllers. The effectiveness of the proposed algorithms is demonstrated through simulation results.
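For readers unfamiliar with leader-following consensus, the following minimal sketch shows the basic mechanism. It deliberately swaps in a much simpler setting than the abstract above describes: first-order linear agents in discrete time with a static leader, rather than the paper's second-order nonlinear protocol. The graph `NEIGHBORS`, the pinned set `PINNED`, and the step size `eps` are assumptions chosen for illustration.

```python
def leader_following_step(x, x_leader, neighbors, pinned, eps=0.2):
    """One synchronous update: each follower moves toward its neighbors,
    and pinned followers are additionally pulled toward the leader."""
    new_x = []
    for i, xi in enumerate(x):
        u = sum(x[j] - xi for j in neighbors[i])  # disagreement with neighbors
        if i in pinned:
            u += x_leader - xi                    # leader tracking term
        new_x.append(xi + eps * u)
    return new_x


def simulate(x0, x_leader, neighbors, pinned, steps=500):
    """Iterate the update from initial follower states x0."""
    x = list(x0)
    for _ in range(steps):
        x = leader_following_step(x, x_leader, neighbors, pinned)
    return x


# Hypothetical path graph 0 - 1 - 2; only follower 0 can observe the leader.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
PINNED = {0}

final = simulate([5.0, -2.0, 3.0], x_leader=1.0, neighbors=NEIGHBORS, pinned=PINNED)
```

Even though only one follower observes the leader directly, information propagates along the connected graph, so all followers approach the leader state. The update is a smooth (linear) function of the errors, in contrast to the discontinuous sign terms that cause chattering in sliding mode designs.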
In this paper, the leader-following tracking problem of fractional-order multi-agent systems is addressed. The dynamics of each agent may be heterogeneous and may contain unknown nonlinearities. Under the assumptions that the interaction topology is undirected and connected and that the unknown nonlinear uncertain dynamics can be parameterized by a neural network, an adaptive learning law is proposed to deal with the unknown nonlinear dynamics, based on which a class of cooperative tracking protocols is constructed. The feedback gain matrix is obtained by solving an algebraic Riccati equation. To construct fully distributed cooperative tracking protocols, an adaptive law is also adopted to adjust the coupling weight. With the developed control laws, we prove that all signals in the closed-loop systems are guaranteed to be uniformly ultimately bounded. Finally, a simple simulation example is provided to illustrate the established result.
Through mutual collaboration and cooperative optimization, multi-agent systems can solve scientific problems related to complex systems that are difficult or impossible for a single agent to solve. In a multi-agent system, agents with a certain degree of autonomy generate complex interactions due to correlation and coordination, which manifest as cooperative or competitive behavior. This survey focuses on multi-agent cooperative optimization and cooperative/non-cooperative games. Starting from cooperative optimization, the studies on distributed optimization and federated optimization are summarized. The survey mainly focuses on distributed online optimization and its application in privacy protection, and gives an overview of federated optimization from the perspective of privacy protection mechanisms. Then, cooperative games and non-cooperative games are introduced to expand the cooperative optimization problems from the two aspects of minimizing global costs and minimizing individual costs, respectively. Multi-agent cooperative and non-cooperative behaviors are modeled by games from both static and dynamic aspects, according to whether each player can make decisions based on the information of other players. Finally, future directions for cooperative optimization, cooperative/non-cooperative games, and their applications are discussed.
To improve the autonomous ability of unmanned aerial vehicles (UAVs) to carry out air combat missions, many artificial intelligence-based studies on autonomous air combat maneuver decision-making have been conducted, but these studies often target individual decision-making in 1v1 scenarios, which rarely occur in actual air combat. Building on research into 1v1 autonomous air combat maneuver decisions, this paper builds a multi-UAV cooperative air combat maneuver decision model based on multi-agent reinforcement learning. First, a bidirectional recurrent neural network (BRNN) is used to achieve communication between individual UAVs, and the multi-UAV cooperative air combat maneuver decision model is established under the actor-critic architecture. Second, by combining target allocation and air combat situation assessment, the tactical goal of the formation is merged with the reinforcement learning goal of every UAV, and a cooperative tactical maneuver policy is generated. The simulation results show that the model established in this paper can obtain the cooperative maneuver policy through reinforcement learning, and that this policy can guide UAVs to gain the overall situational advantage and defeat the opponents through tactical cooperation.
Cooperative multi-agent reinforcement learning (MARL) is an important topic in the field of artificial intelligence, in which distributed constraint optimization (DCOP) algorithms have been widely used to coordinate the actions of multiple agents. However, dense communication among agents affects the practicability of DCOP algorithms. In this paper, we propose a novel DCOP algorithm that addresses the communication problem of previous DCOP algorithms by reducing constraints. The contributions of this paper are primarily threefold: (1) It is proved that removing constraints can effectively reduce the communication burden of DCOP algorithms. (2) A criterion is provided to identify insignificant constraints whose elimination does not have a great impact on the performance of the whole system. (3) A constraint-reduced DCOP algorithm is proposed by adopting a variant of the spectral clustering algorithm to detect and eliminate the insignificant constraints. Our algorithm reduces the communication burden of the benchmark DCOP algorithm while keeping its overall performance unaffected. The performance of the constraint-reduced DCOP algorithm is evaluated on four configurations of cooperative sensor networks. The effectiveness of the communication reduction is also verified by comparisons between the constraint-reduced DCOP and the benchmark DCOP.
Demand for wireless communication continues to grow rapidly as a result of the increasing number of users, the emergence of new user requirements, and the trend toward new access technologies. At the same time, the electromagnetic spectrum, or the frequencies allocated for this purpose, remains limited. This makes solving the frequency assignment problem more and more critical. In this paper, a new approach is proposed that uses self-organizing multi-agent systems to solve distributed dynamic channel assignment; the problem is distributed among agents whose task is to assign personal stations to frequencies with respect to well-known constraints. Agents only know their own variables and the constraints affecting them, and have to negotiate to find a collective solution. The approach is based on macro-level management in the form of a hierarchical group of distributed agents in the network, handling all regional radio access networks (RANs) in a localized region regardless of the operating band. The approach defines cooperative self-organization as the process leading the collective to the solution: agents can change the organization by their own decision to improve the state of the system. Our approach has been tested on the PHILADELPHIA benchmarks of the frequency assignment problem. The results obtained are equivalent to those of existing methods, with the added benefit that our approach is more efficient in terms of flexibility and autonomy.
One promising application of autonomous surface vessels (ASVs) is the use of multiple autonomous tugs for manipulating a floating object such as an oil platform, a broken ship, or a ship in a port area. Considering the real conditions and operations of maritime practice, this paper proposes a multi-agent control algorithm to manipulate a ship to a desired position with a desired heading and velocity under environmental disturbances. The control architecture consists of a supervisory controller in the higher layer and tug controllers in the lower layer. The supervisory controller allocates the towing forces and angles between the tugs and the ship by minimizing the error in the position and velocity of the ship. The weight coefficients in the cost function are designed to be adaptive to guarantee that the towing system functions well under environmental disturbances and to enhance the efficiency of the towing system. Each tug controller provides the force to tow the ship and tracks a reference trajectory that is computed online based on the towing angles calculated by the supervisory controller. Simulation results show that the proposed algorithm can make two autonomous tugs cooperatively tow a ship to a desired position with a desired heading and velocity, even under harsh environmental disturbances.
Rapid and effective maritime search and rescue operations are an important guarantee of the safety of maritime navigation. Existing maritime search and rescue networks and models have slow response speed and low efficiency. The distribution, synergy, parallelism, robustness, and intelligence of unmanned surface vehicles (USVs) and unmanned aerial vehicles (UAVs) provide a new idea for maritime search and rescue networking, in which multiple agents can be used to build a layered control network. In this paper, a novel rapid search and rescue system is proposed that utilizes improved ant colony optimization and the independent computation and decision-making of the agents. The system adopts edge computing and relies on information sharing and cooperative decision-making between the search and rescue agent groups to achieve independent, synchronous search and rescue. At the same time, we use particle swarm optimization to intelligently schedule data packets during the rescue process to optimize network forwarding performance. Based on the distributed cluster control of USVs and UAVs, this paper combines edge computing, cooperative communication, and centralized task allocation to make rescue decisions. The simulation results show that the proposed schemes achieve a significant improvement in maritime search and rescue.
To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. First, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function, and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamically changing dimension of the observational features in partially observable systems, a feature embedding block is proposed. By combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to train the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments verify the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed in terms of algorithm performance, transfer across task scenarios, and self-organization capability after damage, and the potential for deploying and applying DPOMH-PPO in a real environment is verified.
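The role of column-wise pooling in the abstract above can be shown with a small sketch: a variable number of observed entities is compressed into a vector of fixed length. This is an illustration under stated assumptions, not the DPOMH-PPO embedding block itself: pooling is applied directly to raw feature rows (the paper's block may transform them first), and the function name `embed_observations` and the zero vector for an empty observation are hypothetical choices.

```python
def embed_observations(rows, dim):
    """Compress n rows of `dim` features each into a fixed 2*dim vector.

    The first half is column-wise max pooling (CMP), the second half is
    column-wise average pooling (CAP). The result length does not depend on n.
    """
    if not rows:
        # Assumption: nothing observed maps to an all-zero embedding.
        return [0.0] * (2 * dim)
    columns = list(zip(*rows))                      # transpose: one tuple per feature
    cmp_part = [float(max(col)) for col in columns]
    cap_part = [sum(col) / len(col) for col in columns]
    return cmp_part + cap_part


# Two observed targets, then five; both give a 4-dimensional embedding.
two_targets = embed_observations([[1.0, 2.0], [3.0, 0.0]], dim=2)
five_targets = embed_observations([[1.0, 2.0]] * 5, dim=2)
```

Because both pooling operations are symmetric in the rows, the embedding is also independent of the order in which entities are observed, which suits a policy shared by every vehicle in the fleet.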
The finite-time cooperative output tracking problem of multi-agent systems is considered. To enable the agents to quickly track and converge to the external system within a finite time, a novel distributed output feedback control strategy based on a finite-time state observer is designed. This distributed finite-time observer not only solves the cooperative output tracking problem when the agents cannot obtain the external system signal, but also gives the systems faster convergence and good robustness. The finite-time stability of the system is proved using a Lyapunov function. Numerical simulation results are provided to demonstrate the effectiveness of the proposed protocol.
For flexible manufacturing systems with multiple machining and multiple assembly equipment, a new scheduling algorithm is proposed that decomposes the assembly structure of the products, thus obtaining simple scheduling problems and forming the corresponding agents. Then, the importance and the restrictions of each agent are considered to obtain an ordering of the simple scheduling problems based on cooperative game theory. With this ordering, the scheduling of the sub-problems is carried out according to rules, and near-optimal scheduling results that meet the restrictions can be obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.
Funding (RegionSTLight traffic signal control article): supported by the National Science and Technology Major Project (2021ZD0112702), the National Natural Science Foundation (NNSF) of China (62373100, 62233003), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
Funding (event-triggered cooperative output regulation article): supported by the National Natural Science Foundation of China (NSFC) Excellent Young Scientists Fund (Hong Kong and Macao) under Grant 62222318.
Funding (sampled-communication cooperative control article): supported by the National Natural Science Foundation of China (91016017) and the National Aviation Fund of China (20115868009).
Funding (supply chain cooperative negotiation article): supported by the National Natural Science Foundation of China (No. 70401013) and the National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2006BAH02A06).
Funding (multi-UAV cooperative control article): supported by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah (G-363-135-1438).
Funding: Supported by the National Natural Science Foundation of China (61303211) and the Zhejiang Provincial Natural Science Foundation of China (LY17F030003, LY15F030009).
Abstract: This paper addresses the leader-following tracking problem of fractional-order multi-agent systems. The dynamics of the agents may be heterogeneous and contain unknown nonlinearities. Assuming that the interaction topology is undirected and connected and that the unknown nonlinear dynamics can be parameterized by a neural network, an adaptive learning law is proposed to handle the unknown nonlinear dynamics, and a class of cooperative tracking protocols is constructed on that basis. The feedback gain matrix is obtained by solving an algebraic Riccati equation. To make the cooperative tracking protocols fully distributed, an adaptive law is also used to adjust the coupling weight. With the developed control laws, all signals in the closed-loop system are proved to be uniformly ultimately bounded. Finally, a simple simulation example illustrates the established result.
Funding: Supported in part by the National Natural Science Foundation of China (Basic Science Center Program: 61988101), the Sino-German Center for Research Promotion (M-0066), the International (Regional) Cooperation and Exchange Project (61720106008), the Programme of Introducing Talents of Discipline to Universities (the 111 Project) (B17017), and the Program of Shanghai Academic Research Leader (20XD1401300).
Abstract: Through mutual collaboration and cooperative optimization, multi-agent systems can solve scientific problems in complex systems that are difficult or impossible for a single agent to solve. In a multi-agent system, agents with a certain degree of autonomy generate complex interactions through correlation and coordination, which manifest as cooperative or competitive behavior. This survey focuses on multi-agent cooperative optimization and cooperative/non-cooperative games. Starting from cooperative optimization, it summarizes studies on distributed optimization and federated optimization. It mainly covers distributed online optimization and its application in privacy protection, and reviews federated optimization from the perspective of privacy protection mechanisms. Cooperative and non-cooperative games are then introduced to extend cooperative optimization problems in two directions: minimizing global costs and minimizing individual costs, respectively. Multi-agent cooperative and non-cooperative behaviors are modeled as games from both static and dynamic aspects, according to whether each player can make decisions based on the information of other players. Finally, future directions for cooperative optimization, cooperative/non-cooperative games, and their applications are discussed.
Funding: Supported by the Aeronautical Science Foundation of China (2017ZC53033) and the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University (CX2020156).
Abstract: To improve the ability of unmanned aerial vehicles (UAVs) to carry out air combat missions autonomously, many studies on artificial-intelligence-based autonomous air combat maneuver decision-making have been conducted. These studies, however, usually target individual decision-making in 1v1 scenarios, which rarely occur in actual air combat. Building on research into 1v1 autonomous air combat maneuver decisions, this paper constructs a multi-UAV cooperative air combat maneuver decision model based on multi-agent reinforcement learning. First, a bidirectional recurrent neural network (BRNN) is used to achieve communication between individual UAVs, and the multi-UAV cooperative air combat maneuver decision model is established under the actor-critic architecture. Second, by combining target allocation with air combat situation assessment, the tactical goal of the formation is merged with the reinforcement learning goal of each UAV, and a cooperative tactical maneuver policy is generated. The simulation results show that the model can learn the cooperative maneuver policy through reinforcement learning, and that this policy can guide the UAVs to gain the overall situational advantage and defeat their opponents through tactical cooperation.
Funding: Supported by the National Social Science Foundation of China (15ZDA034, 14BZZ028), the Beijing Social Science Foundation (16JDGLA036), and the JKF Program of People's Public Security University of China (2016JKF01318).
Abstract: Cooperative multi-agent reinforcement learning (MARL) is an important topic in artificial intelligence, in which distributed constraint optimization (DCOP) algorithms have been widely used to coordinate the actions of multiple agents. However, dense communication among agents limits the practicability of DCOP algorithms. In this paper, we propose a novel DCOP algorithm that addresses the communication problem of previous DCOP algorithms by reducing constraints. The contributions of this paper are primarily threefold: (1) It is proved that removing constraints can effectively reduce the communication burden of DCOP algorithms. (2) A criterion is provided to identify insignificant constraints whose elimination does not greatly affect the performance of the whole system. (3) A constraint-reduced DCOP algorithm is proposed, which adopts a variant of the spectral clustering algorithm to detect and eliminate the insignificant constraints. Our algorithm reduces the communication burden of the benchmark DCOP algorithm while keeping its overall performance unaffected. The performance of the constraint-reduced DCOP algorithm is evaluated on four configurations of cooperative sensor networks. The effectiveness of the communication reduction is also verified by comparing the constraint-reduced DCOP with the benchmark DCOP.
Abstract: Demand for wireless communication continues to grow rapidly as a result of the increasing number of users, the emergence of new user requirements, and the trend toward new access technologies. At the same time, the electromagnetic spectrum, or the frequencies allocated for this purpose, remains limited. This makes solving the frequency assignment problem more and more critical. In this paper, a new approach is proposed that uses self-organizing multi-agent systems to solve distributed dynamic channel assignment; the problem is distributed among agents whose task is to assign personal stations to frequencies subject to well-known constraints. Agents know only their own variables and the constraints affecting them, and have to negotiate to find a collective solution. The approach is based on macro-level management in the form of a hierarchical group of distributed agents in the network, handling all RANs (Regional Radio Access Networks) in a localized region regardless of the operating band. The approach defines cooperative self-organization as the process that leads the collective to the solution: agents can change the organization by their own decision to improve the state of the system. Our approach has been tested on the Philadelphia benchmarks of the frequency assignment problem. The results obtained are equivalent to those of existing methods, with the added benefit that our approach is more flexible and autonomous.
Funding: Supported by the China Scholarship Council (201806950080), the Researchlab Autonomous Shipping (RAS) of Delft University of Technology, and the INTERREG North Sea Region Grant "AVATAR" funded by the European Regional Development Fund.
Abstract: Among the promising applications of autonomous surface vessels (ASVs) is the use of multiple autonomous tugs to manipulate a floating object such as an oil platform, a broken ship, or a ship in a port area. Considering the real conditions and operations of maritime practice, this paper proposes a multi-agent control algorithm to manipulate a ship to a desired position with a desired heading and velocity under environmental disturbances. The control architecture consists of a supervisory controller in the higher layer and tug controllers in the lower layer. The supervisory controller allocates the towing forces and angles between the tugs and the ship by minimizing the error in the position and velocity of the ship. The weight coefficients in the cost function are designed to be adaptive, both to guarantee that the towing system functions well under environmental disturbances and to enhance its efficiency. Each tug controller provides the force to tow the ship and tracks the reference trajectory, which is computed online from the towing angles calculated by the supervisory controller. Simulation results show that the proposed algorithm enables two autonomous tugs to cooperatively tow a ship to a desired position with a desired heading and velocity, even under harsh environmental disturbances.
Funding: Supported in part by the Natural Science Foundation of China under Grant 61771086, the China Postdoctoral Science Foundation under Grant 2015T80238, and the Dalian Outstanding Young Science and Technology Talents Foundation. Part of this work was published at IEEE Green Computing 2018.
Abstract: Rapid and effective maritime search and rescue operations are an important guarantee of the safety of maritime navigation. Existing maritime search and rescue networks and models respond slowly and are inefficient. The distribution, synergy, parallelism, robustness and intelligence of unmanned surface vehicles (USVs) and unmanned aerial vehicles (UAVs) suggest a new approach to maritime search and rescue networking, in which multiple agents can be used to build a layered control network. In this paper, a novel rapid search and rescue system is proposed that uses improved ant colony optimization and the independent computation and decision-making of the agents. The system adopts edge computing and relies on information sharing and cooperative decision-making among the search and rescue agent groups, achieving independent, synchronous search and rescue. At the same time, particle swarm optimization is used to schedule data packets intelligently during the rescue process in order to optimize network forwarding performance. Based on the distributed cluster control of USVs and UAVs, this paper combines edge computing, cooperative communication and centralized task allocation to make rescue decisions. The simulation results show that the proposed schemes achieve a significant improvement in maritime search and rescue.
Funding: Financial support from the National Natural Science Foundation of China (Grant No. 61601491), the Natural Science Foundation of Hubei Province, China (Grant No. 2018CFC865), and the Military Research Project of China (Grant No. YJ2020B117).
Abstract: To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. First, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function and an action space suited to multi-target hunting tasks are designed. To handle the changing dimension of the observational features that a partially observable system provides as input, a feature embedding block is proposed. It establishes the observational feature encoding by combining two feature compression methods: column-wise max pooling (CMP) and column-wise average pooling (CAP). Finally, the centralized training and decentralized execution framework is adopted to train the hunting strategy. All USVs in the fleet share the same policy, and each performs its actions independently. Simulation experiments verify the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are analyzed in terms of algorithm performance, migration effect across task scenarios, and self-organization capability after damage, and the potential for deploying and applying DPOMH-PPO in a real environment is verified.
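The feature embedding step can be illustrated with a short Python sketch. The abstract does not say how CMP and CAP are combined, so this sketch assumes that the two pooled vectors are simply concatenated. The function name `embed_observations` and the row-per-entity layout are hypothetical as well.

```python
from typing import List


def embed_observations(features: List[List[float]]) -> List[float]:
    """Compress a variable number of per-entity feature rows into a fixed-size vector.

    Each row describes one observed entity and each column is one feature.
    The result is the column-wise max (CMP) followed by the column-wise
    average (CAP), so its length is always 2 * num_features, however many
    entities are currently observed. Concatenation is an assumption here.
    """
    if not features:
        raise ValueError("at least one observed entity is required")
    width = len(features[0])
    if any(len(row) != width for row in features):
        raise ValueError("all rows must have the same number of features")
    columns = list(zip(*features))  # transpose: one tuple per feature column
    cmp_part = [max(col) for col in columns]
    cap_part = [sum(col) / len(col) for col in columns]
    return cmp_part + cap_part


if __name__ == "__main__":
    two_entities = [[1.0, 4.0], [3.0, 2.0]]
    three_entities = [[1.0, 4.0], [3.0, 2.0], [0.0, 0.0]]
    print(embed_observations(two_entities))
    print(len(embed_observations(three_entities)))
```

Both pooling operations ignore row order, so the encoding does not depend on the order in which neighbours are listed, and its length stays fixed as entities enter or leave the observation range. This is what lets one shared policy network accept the encoding from every USV.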
Funding: National Natural Science Foundation of China (No. 61663020), National Key R&D Program of China (No. 2017YFB1201003-020), and Natural Science Foundation of Gansu Province (No. 17JR5RA096).
Abstract: The finite-time cooperative output tracking problem of multi-agent systems is considered. To enable the agents to track and converge to the external system within a finite time, a novel distributed output feedback control strategy based on a finite-time state observer is designed. The distributed finite-time observer not only solves the cooperative output tracking problem when the agents cannot access the external system signal, but also gives the system faster convergence and good robustness. The finite-time stability of the system is proved using a Lyapunov function. Numerical simulation results demonstrate the effectiveness of the proposed protocol.
Abstract: For flexible manufacturing systems with multiple machining and assembly equipment, a new scheduling algorithm is proposed that decomposes the assembly structure of the products, yielding simple scheduling problems and forming the corresponding agents. The importance and the restrictions of each agent are then considered in order to rank the simple scheduling problems on the basis of cooperative game theory. Following this order, the sub-problems are scheduled according to rules, and near-optimal scheduling results that satisfy the restrictions can be obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.