Funding: National Natural Science Foundation of China, Grant/Award Number: 61872171; The Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Grant/Award Number: 2021490811.
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can provide outdated experiences that hinder convergence, resulting in ineffective training performance for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed that generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning process by engaging in actions that increase the mutual information between their actions and the environment state. In addition, MioDSC can generate cooperative policies using the options framework, allowing agents to learn and reuse complex action sequences and accelerating the convergence of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and is robust across various multi-agent system tasks with high stability.
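The abstract above does not say how the mutual information between actions and environment states is estimated. As a rough illustration only, the sketch below computes the textbook plug-in estimate of I(S; A) from discrete (state, action) samples. The function name `empirical_mutual_information` and the discrete setting are assumptions made here; MioDSC targets continuous spaces and its intrinsic reward may be computed quite differently.

```python
import math
from collections import Counter


def empirical_mutual_information(pairs):
    """Plug-in estimate of I(S; A) in nats from a list of (state, action) samples."""
    n = len(pairs)
    joint = Counter(pairs)                      # counts of each (s, a) pair
    states = Counter(s for s, _ in pairs)       # marginal counts of states
    actions = Counter(a for _, a in pairs)      # marginal counts of actions
    mi = 0.0
    for (s, a), c in joint.items():
        # p(s, a) * log( p(s, a) / (p(s) * p(a)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (states[s] * actions[a]))
    return mi


if __name__ == "__main__":
    # Actions independent of states: the estimate is zero.
    print(empirical_mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)]))
    # Actions fully determined by states: the estimate equals log(2).
    print(empirical_mutual_information([(0, 0), (1, 1)]))
```

An intrinsic reward in this spirit would be larger for behaviour whose actions are more informative about the visited states; how that quantity is weighted against the task reward is not stated in the abstract.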
Abstract: In the monitoring, diagnosis and control of industrial batch and continuous equipment and processes, unpredictable phenomena cannot be planned for in advance: the chemical composition of the raw materials varies, the process leads to unexpected reactions and changes its parameters, and so on. An agent is an active program entity with its own ideas about how to perform the tasks on its own agenda. Agents perceive, behave "reasonably", and communicate with other agents. Agents can represent equipment and operations in batch processes as recommended by ISA S88. The Jadex system is based on the Java language and on the recommendations of the FIPA organization. The paper describes the ripening tank T406 and the recipe for yogurt production at the MADETA Corp. holding in the Czech Republic, together with the modeling and display of the "normal" and the error (fault) unit states of the ripening tank. Agents within the Jadex system describe the behavior of ripening tank T406 with state diagrams (automata) and assist in diagnosing fault states. States are described in SCXML (State Chart XML), an XML-based language. The Jadex Control Center (JCC) is the main access point for real-time operation.
Funding: supported by the National Natural Science Foundation of China (61433004, 61603085), the China Postdoctoral Science Foundation (2015M570253), and the Fundamental Research Funds for the Central Universities (N150403004).
Funding: Qingmiao Foundation of Antai School of Management of SJTU.
Abstract: Coordinating all the activities among all the parties involved in a supply chain can be a daunting task. This paper puts forth the viewpoint of applying agent technology to automate the coordination and decision-making tasks in a typical home PC industry supply chain. The main features of the proposed approach, which differentiate it from other approaches, are the following: ① in the prototype, the coordination agents have both cooperation and competition patterns; ② it uses JADE (Java Agent DEvelopment Framework) as the agent development environment to realize efficient and reusable agent software; ③ it produces some innovative models for the business processes and issues faced by parties in the supply chain. A prototype and the overall process flow are also described.
Abstract: Today's production systems are required to exhibit increased flexibility and mutability in order to deal with dynamically changing conditions and objectives and an increasing number of product variants within turbulent industrial environments. Flexible automated systems, e.g. robot-based hardware and PC-based controllers, are called for in order to improve dynamic production efficiency, but these usually induce significantly higher production complexity, so that the effort for planning and programming, as well as for setup and reconfiguration, grows. In this paper a definition and some concepts of self-optimizing assembly systems are presented to describe possible ways to reduce the planning effort in complex production systems. The concept of self-optimization in assembly systems is derived from a theoretical approach and transferred to a specific application scenario, the automated assembly of a miniaturized solid-state laser, where the challenges posed by unpredictable influences such as component tolerances can be overcome with the help of self-optimization.
Abstract: This paper investigates the distributed finite-time consensus tracking problem for higher-order nonlinear multi-agent systems (MASs). The distributed finite-time consensus protocol is based on a full-order sliding surface and the super-twisting algorithm. The nominal consensus control for the MASs is designed using the geometric homogeneous finite-time control technique. Chattering is avoided by designing a full-order sliding surface. The switching control is constructed by integrating the super-twisting algorithm; hence a chattering-alleviation protocol is obtained that maintains a smooth control input. The finite-time convergence analysis for the leader-follower network is presented using a strict Lyapunov function. Finally, numerical simulations validate the proposed homogeneous full-order sliding mode control for higher-order MASs.
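For readers unfamiliar with the super-twisting algorithm named in the abstract above, the following is a minimal scalar sketch, not the paper's multi-agent protocol. It simulates a sliding variable with dynamics s' = u + d(t) under the standard super-twisting law u = -k1 |s|^(1/2) sign(s) + v, v' = -k2 sign(s). The gains, the disturbance, the step size and the function names are all illustrative assumptions.

```python
import math


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def simulate_super_twisting(s0=1.0, k1=3.0, k2=2.0, dt=1e-3, t_end=10.0):
    """Euler simulation of s' = u + d(t) under a super-twisting control law."""
    s, v, t = s0, 0.0, 0.0
    history = [s]
    for _ in range(int(round(t_end / dt))):
        d = 0.5 * math.sin(t)                       # assumed disturbance, |d'(t)| <= 0.5
        u = -k1 * math.sqrt(abs(s)) * sign(s) + v   # control input, continuous in s
        v += dt * (-k2 * sign(s))                   # the switching term is integrated
        s += dt * (u + d)
        t += dt
        history.append(s)
    return history


if __name__ == "__main__":
    trajectory = simulate_super_twisting()
    print("final |s| =", abs(trajectory[-1]))
```

Because the discontinuous `sign` term enters only through the integrator state `v`, the applied input `u` stays continuous, which is the usual reason the algorithm is chosen to alleviate chattering.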
Funding: co-supported by the Equipment Advance Research Project, China (No. 50912020401) and the Chinese Government Scholarship (No. 201906830037).
Abstract: This paper presents a discrete-time attitude control strategy with equi-global practical stabilizability for aligning the attitude of multiple spacecraft to a predesigned configuration according to a time-variant reference. By utilizing the interference of the wireless channel, the communication scheme designed in this paper can save communication resources, computation, and energy in proportion to the number of spacecraft. The exact and approximate discrete-time models of the consensus-based spacecraft tracking system are given. A framework is then developed for designing an event-triggered control scheme for the exact discrete-time system via its approximate models; the scheme avoids periodic actuation, and Zeno behavior is proved to be excluded. Furthermore, the control scheme can handle the presence of an unknown fading channel. Finally, simulation results are presented to demonstrate the effectiveness of the control strategy.
Funding: supported by the National Natural Science Foundation of China (No. 61790573); supported by the National Natural Science Foundation of China (Nos. 61890924, 61991404) and the Liao Ning Revitalization Talents Program (No. XLYC1907087).
Abstract: Distributed state estimation is of paramount importance in many applications involving large-scale complex systems observed by spatially deployed networked sensors. This paper provides an overview of the analysis of distributed state estimation algorithms for linear time-invariant systems. A number of previous works are reviewed and a clear classification of the main approaches in this field is presented, i.e., Kalman-filter-type methods and Luenberger-observer-type methods. The design and the stability analysis of these methods are discussed. Moreover, a comprehensive comparison of the existing results is provided in terms of some standard metrics, including graph connectivity, system observability, optimality, time scale and so on. Finally, several important and challenging future research directions are discussed.
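As background for the Luenberger-observer-type methods mentioned in the abstract above, here is a minimal sketch of a single, centralized discrete-time Luenberger observer. It shows only the building block, not any of the distributed schemes the survey reviews. The double-integrator model, the gain values and the function name `run_observer` are assumptions chosen for illustration.

```python
def run_observer(steps=50):
    """Return the final estimation error norm of a discrete-time Luenberger observer."""
    dt = 0.1
    x = [1.0, -0.5]      # true state (position, velocity), unknown to the observer
    xh = [0.0, 0.0]      # observer estimate, deliberately initialised wrongly
    l1, l2 = 1.0, 2.5    # assumed gain placing both poles of (A - L C) at 0.5
    for _ in range(steps):
        y = x[0]                       # measurement y = C x with C = [1, 0]
        innov = y - xh[0]              # output estimation error
        # x_hat+ = A x_hat + L (y - C x_hat), with A = [[1, dt], [0, 1]]
        xh = [xh[0] + dt * xh[1] + l1 * innov, xh[1] + l2 * innov]
        x = [x[0] + dt * x[1], x[1]]   # true dynamics x+ = A x
    return ((x[0] - xh[0]) ** 2 + (x[1] - xh[1]) ** 2) ** 0.5


if __name__ == "__main__":
    print("estimation error after 50 steps:", run_observer())
```

The estimation error evolves as e+ = (A - L C) e, so it decays whenever the gain makes A - L C stable. Distributed variants typically add a consensus term on neighbors' estimates to this update, in a form that depends on the specific method.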
Funding: supported in part by the National Natural Science Foundation of China (62121004, 62033003, 61973091, 62203119), the Local Innovative and Research Teams Project of Guangdong Special Support Program (2019BT02X353), the Natural Science Foundation of Guangdong Province (2023A1515011527, 2022A1515011506), and the China National Postdoctoral Program (BX20220095, 2022M710826).
Abstract: The dynamic event-triggered (DET) formation control problem for a class of stochastic nonlinear multi-agent systems (MASs) with full state constraints is investigated in this article. A human operator sends commands to the leader as control input signals, and all followers keep formation by communicating over the network topology. Under the command-filter-based backstepping technique, radial basis function neural networks (RBF NNs) and a barrier Lyapunov function (BLF) are used to handle the unknown nonlinear terms and the full state constraints, respectively. Furthermore, a DET control mechanism is proposed to reduce the occupation of communication bandwidth. The presented distributed formation control strategy guarantees that all signals of the MASs are semi-globally uniformly ultimately bounded (SGUUB) in probability. Finally, the feasibility of the theoretical result is demonstrated by a simulation example.
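To illustrate how event triggering saves communication bandwidth, the sketch below simulates consensus among single-integrator agents that rebroadcast their state only when it has drifted from the last broadcast value by more than a fixed threshold. This is a deliberately simplified static-threshold rule, not the dynamic event-triggered mechanism, the stochastic nonlinear dynamics or the formation controller of the article above. The ring graph, threshold and function name are assumptions.

```python
def event_triggered_consensus(x0, neighbors, threshold=0.05, dt=0.01, steps=2000):
    """Single-integrator consensus where agents rebroadcast only on large drift."""
    x = list(x0)
    x_hat = list(x0)      # last broadcast state of each agent
    events = len(x0)      # the initial broadcast counts as one event per agent
    for _ in range(steps):
        # Each agent checks its own trigger condition using local information.
        for i in range(len(x)):
            if abs(x[i] - x_hat[i]) > threshold:
                x_hat[i] = x[i]
                events += 1
        # The control law uses broadcast values only, never true neighbor states.
        u = [-sum(x_hat[i] - x_hat[j] for j in neighbors[i]) for i in range(len(x))]
        x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x, events


if __name__ == "__main__":
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # assumed undirected ring
    final, events = event_triggered_consensus([1.0, -2.0, 3.0, 0.5], ring)
    print("spread:", max(final) - min(final), "broadcasts:", events)
```

With a fixed threshold the agents reach only approximate agreement, within a band set by the threshold. A dynamic rule adapts the trigger condition through an auxiliary internal variable instead of a constant.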
Funding: National Natural Science Foundation of China (90605007).
Abstract: To investigate the control of morphing wings by means of interacting effectors, this article proposes a distributed coordinated control scheme with sampled communication, based on a simple morphing wing model built from arrayed agents. The control scheme can change the shape of the airfoil into an expected one and keep it smooth during morphing. Because the interconnection of the communication network and the agents complicates the behavior of the morphing wing system, a diagrammatic stability analysis method is put forward to ensure system stability. Two simulations of the morphing wing system are carried out using MATLAB. The results confirm the feasibility of the distributed coordinated control scheme and the effectiveness of the diagrammatic stability analysis method.
Funding: supported by the National Natural Science Foundation of China (No. 61039001) and the National Science and Technology Support Program of China (No. 2011BAH24B10).
Abstract: This study analyzes the cooperative coalition problem for formation scheduling based on incomplete information. A multi-agent cooperative coalition framework is developed to optimize the formation scheduling problem in a decentralized manner. The social class differentiation mechanism and role-assuming mechanism are incorporated into the framework, which, in turn, ensures that the multi-agent system (MAS) evolves in the optimal direction. Moreover, a further differentiation pressure can be achieved to help the MAS escape from local optima. A Bayesian coalition negotiation algorithm is constructed, within which the Harsanyi transformation is introduced to transform the coalition problem based on incomplete information into the Bayesian-equivalent coalition problem based on imperfect information. The simulation results suggest that the distribution of agents' expectations of other agents' unknown information approximates the true distribution after a finite number of generations. The comparisons indicate that the MAS cooperative coalition algorithm produces significantly better utility and is more effective at escaping from local optima than the proposal-engaged marriage algorithm and the Simulated Annealing algorithm.