Funding: This research was funded by the National Natural Science Foundation of China, Grant Number 62106283.
Abstract: A single agent facing large-scale dynamic task allocation (DTA) in a high-dimensional decision space suffers from low solution accuracy and high decision pressure. To address these problems, this paper draws on deep reinforcement learning (DRL) theory and proposes an improved Multi-Agent Deep Deterministic Policy Gradient algorithm (MADDPG-D2), built on a multi-agent architecture with a dual experience replay pool and dual noise, to improve the efficiency of DTA. The algorithm extends the traditional Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm: a dual noise mechanism enlarges the action exploration space in the early stage of training, and a dual experience pool improves the data utilization rate. In addition, to speed up agent training and to solve the cold-start problem, prior knowledge is applied during training. Finally, the MADDPG-D2 algorithm is compared and analyzed on a digital battlefield of ground and air confrontation. The experimental results show that agents trained by MADDPG-D2 achieve higher win rates and average rewards, use resources more reasonably, and cope better with the high-dimensional decision space that traditional single-agent algorithms find difficult to solve. The proposed multi-agent MADDPG-D2 algorithm is therefore both superior and well founded for DTA.
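The dual experience pool mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the two pools are filled or sampled, so everything below is an assumption made for illustration: the class name `DualReplayPool`, the reward threshold that routes a transition to one pool or the other, and the fixed mixing fraction are all hypothetical and are not taken from the paper. The dual noise mechanism is not covered here.

```python
import random
from collections import deque


class DualReplayPool:
    """Two FIFO pools whose samples are mixed into one batch.

    Assumed design (not from the paper): transitions whose reward reaches
    `reward_threshold` go to a "priority" pool, all others to an "ordinary" pool.
    """

    def __init__(self, capacity, reward_threshold, priority_fraction=0.5, seed=None):
        self.ordinary = deque(maxlen=capacity)
        self.priority = deque(maxlen=capacity)
        self.reward_threshold = reward_threshold
        self.priority_fraction = priority_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store one transition in the pool selected by its reward."""
        transition = (state, action, reward, next_state, done)
        pool = self.priority if reward >= self.reward_threshold else self.ordinary
        pool.append(transition)

    def sample(self, batch_size):
        """Draw a batch that mixes both pools; may be short if the pools are small."""
        n_priority = min(int(batch_size * self.priority_fraction), len(self.priority))
        n_ordinary = min(batch_size - n_priority, len(self.ordinary))
        batch = self.rng.sample(list(self.priority), n_priority)
        batch += self.rng.sample(list(self.ordinary), n_ordinary)
        self.rng.shuffle(batch)
        return batch

    def __len__(self):
        return len(self.ordinary) + len(self.priority)
```

In a MADDPG-style training loop, each agent's critic update would draw its minibatch from `sample`; keeping a separate high-reward pool is one plausible way to reuse informative transitions more often.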
Funding: Project (2017YFB1301104) supported by the National Key Research and Development Program of China; Projects (61906212, 61802426) supported by the National Natural Science Foundation of China.
Abstract: In this paper, we consider a multi-UAV surveillance scenario in which a team of unmanned aerial vehicles (UAVs) synchronously covers an area to monitor ground conditions. In this scenario, we adopt the leader-follower control mode and propose a modified Lyapunov guidance vector field (LGVF) approach to improve the precision of surveillance trajectory tracking. Then, to adapt to poor communication conditions, we propose a prediction-based synchronization method that keeps the formation consistent. Moreover, to adapt the multi-UAV system to dynamic and uncertain environments, this paper proposes a hierarchical dynamic task scheduling architecture. In this architecture, we first classify all task-performing algorithms according to their functions and then modularize them using plugin technology. Afterwards, by integrating the behavior model with the plugin technique, this paper designs a three-layer control flow that efficiently achieves dynamic task scheduling. To verify the effectiveness of the architecture, we consider a multi-UAV traffic monitoring scenario and design several cases that demonstrate online adjustment at each of the three levels.
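For readers unfamiliar with the LGVF idea referenced in the abstract above, the following sketch shows the commonly used baseline form of a Lyapunov guidance vector field for circling a fixed point. It is not the modified LGVF that the paper proposes, whose details the abstract does not give. The function name `lgvf_velocity` and the choice of the origin as loiter center are assumptions for illustration.

```python
import math


def lgvf_velocity(x, y, speed, radius):
    """Desired planar velocity for loitering counterclockwise around the origin.

    Baseline LGVF form: the returned vector always has magnitude `speed`,
    points inward when the vehicle is outside the circle of the given radius,
    outward when inside, and is purely tangential on the circle.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("the field is undefined at the loiter center")
    scale = -speed / (r * (r * r + radius * radius))
    vx = scale * (x * (r * r - radius * radius) + y * (2.0 * r * radius))
    vy = scale * (y * (r * r - radius * radius) - x * (2.0 * r * radius))
    return vx, vy
```

The radial component of this field equals `-speed * (r**2 - radius**2) / (r**2 + radius**2)`, which is why the distance to the desired circle decreases monotonically from either side.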
Abstract: Traditionally, heuristic re-planning algorithms are used to tackle dynamic task planning for multiple satellites. However, traditional heuristic strategies depend on the concrete tasks, which often compromises the optimality of the result. Noting that the historical information of cooperative task planning influences later planning results, we propose a hybrid learning algorithm for dynamic multi-satellite task planning that combines multi-agent reinforcement learning based on policy iteration with transfer learning. The reinforcement learning strategy of each satellite is represented by a neural network. The policy networks with the best topology and weights are found by iteratively applying co-evolutionary search. To prevent historical learning from failing when observation requests occur randomly, a novel approach is proposed that balances the quality and efficiency of task planning: it uses transfer learning to convert the historically learned strategy into the initial strategy for the current learning round. Simulations and analysis show the feasibility and adaptability of the proposed approach, especially in situations with randomly occurring observation requests.
Abstract: An algorithm is proposed for scheduling dependent tasks in time-varying heterogeneous multiprocessor systems, in which the computational power of processors and the links between them are allowed to change over time. Link contention is taken into account in the multiprocessor scheduling problem. A linear switching state-space modeling paradigm is introduced to enable theoretical analysis from a systems engineering perspective. Theoretical analysis of this model shows its robustness against changes in processing power and against link failure. The proposed algorithm uses a fuzzy decision-making procedure to handle changes in the multiprocessor system. Its efficiency is illustrated by several random experiments and by comparison against a recent benchmark approach. The results show up to 18% average improvement in makespan, especially for larger-scale systems.
Funding: Projects (61173026, 61373045, 61202039) supported by the National Natural Science Foundation of China; Projects (K5051223008, BDY221411) supported by the Fundamental Research Funds for the Central Universities of China; Project (2012AA02A603) supported by the High-Tech Research and Development Program of China.
Abstract: Because the ability of a single agent is limited while information and resources in multi-agent systems are distributed, agents must cooperate to accomplish a complex task. In the open and changeable environment of the Internet, it is of great significance to study systems that are flexible and capable of dynamic evolution, and to find a collaboration method that agents can use during that evolution. With such a method, agents accomplish tasks toward an overall target while their collaborative relationships are adjusted as the environment changes. A method for task decomposition and agent collaboration based on an improved contract net protocol is introduced. Finally, analysis of the experimental results verifies that the improved contract net protocol can greatly increase the efficiency of communication and collaboration in a multi-agent system.
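As background for the abstract above, the basic contract net protocol follows an announce, bid, award cycle. The sketch below shows only that basic cycle, not the improved protocol the paper introduces, which the abstract does not detail. The `Contractor` class, the skill-matching rule, and the "current load plus task cost" bid are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contractor:
    """An agent that can bid on announced tasks (illustrative model)."""
    name: str
    skills: set
    load: int = 0

    def bid(self, task_skill, task_cost):
        """Return an estimated completion cost, or None to decline the task."""
        if task_skill not in self.skills:
            return None
        return self.load + task_cost  # lower bids are better


def allocate(tasks, contractors):
    """Manager side: announce each task, collect bids, award to the lowest bid."""
    awards = {}
    for task_name, skill, cost in tasks:
        bids = []
        for contractor in contractors:
            value = contractor.bid(skill, cost)
            if value is not None:
                bids.append((value, contractor.name, contractor))
        if not bids:
            awards[task_name] = None  # no agent is able to take the task
            continue
        # Ties are broken by contractor name so the result is deterministic.
        _, _, winner = min(bids, key=lambda b: (b[0], b[1]))
        winner.load += cost
        awards[task_name] = winner.name
    return awards
```

Because each bid reflects the bidder's current load, successive awards tend to spread work across the capable agents without any central knowledge of their internal state.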
Funding: the National Natural Science Foundation of China under grant No. 60073008, the NKBRSF of China under grant No. G1999032707 (973
Abstract: This paper presents a new soliton approach to hyper-distributed, hyper-parallel, self-organizing dynamic scheduling for task allocation among rational autonomous agents in a multi-agent system (MAS). This approach can overcome many drawbacks of other mechanisms currently used for coalition formation and cooperation in MAS. Thorny problems such as overabundant bids, social behaviors, colony intelligence, variable neighbors, and interdependency can easily be treated with the proposed approach, whereas they are very difficult for conventional approaches. A simulation on a distributed transport scheduling system shows that the soliton approach is characterized by hyper-parallelism, effectiveness, openness, dynamic alignment, and adaptation.
Abstract: Multi-core processors are widely used as the running platform for safety-critical real-time systems such as spacecraft, in which various types of real-time tasks are added dynamically at runtime. To improve the utilization of multi-core processors while ensuring real-time performance, a reasonable real-time task allocation method is needed, but existing methods either target only single-core processors or perform too poorly to be applicable. For the task allocation problem in which mixed real-time tasks are added dynamically, we propose VU-WF (Virtual Utilization Worst Fit), a heuristic mixed real-time task allocation algorithm for multi-core processors based on virtual utilization. First, a 4-tuple task model is established to describe fixed-point tasks and sporadic tasks in a unified manner. Then, a VDS (Virtual Deferral Server) that serves the execution requests of fixed-point tasks is constructed, and a schedulability test for the mixed task set is derived. Finally, building on an analysis of the VDS's capacity, VU-WF is proposed; it selects cores in ascending order of virtual utilization for the schedulability test. Experiments show that VU-WF performs better overall than available algorithms: it achieves a good schedulability ratio and load balancing and also has the lowest runtime overhead. On a 4-core processor, compared with available algorithms of the same schedulability ratio, load balancing is improved by 73.9% and runtime overhead is reduced by 38.3%. In addition, we develop RT-MCSS, an open-source visual simulator for multi-core mixed task scheduling, to facilitate the design and verification of multi-core scheduling. Owing to its high performance, VU-WF can be widely used in resource-constrained and safety-critical real-time systems such as spacecraft, self-driving cars, and industrial robots.
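The core-selection step described in the abstract above (try cores in ascending order of utilization, admit the task on the first core that passes a schedulability test) can be sketched as a plain worst-fit allocator. This is only a stand-in: the paper's virtual utilization and its VDS-based schedulability test are not given in the abstract, so the sketch uses ordinary utilization and a simple utilization bound instead. The function name `worst_fit_allocate` and the `bound` parameter are hypothetical.

```python
def worst_fit_allocate(core_utilization, task_utilization, bound=1.0):
    """Place a newly arrived task on the least-loaded core that can admit it.

    `core_utilization` is a mutable list of per-core utilizations and is
    updated in place. Returns the chosen core index, or None if no core
    passes the admission test (here: total utilization must not exceed `bound`).
    """
    # Ascending utilization; ties are broken by core index.
    order = sorted(range(len(core_utilization)),
                   key=lambda i: (core_utilization[i], i))
    for core in order:
        # Stand-in admission test. With a pure utilization bound, a failure on
        # the first core implies failure on all; the loop is kept because a
        # richer per-core test would not have that property.
        if core_utilization[core] + task_utilization <= bound + 1e-12:
            core_utilization[core] += task_utilization
            return core
    return None
```

Choosing the least-loaded core first is what makes the heuristic worst fit, and it is also why such allocators tend to balance load across cores.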
Funding: This work was supported by the China Scholarship Council under Grant 201906320221.
Abstract: Transient stability batch assessment (TSBA) is essential for dynamic security checks in both power system planning and day-ahead dispatch. It is also a necessary technique for generating sufficient training data for data-driven online transient stability assessment (TSA). However, most existing work suffers from problems including high computational burden, low model adaptability, and low performance robustness. TSBA therefore remains a significant challenge in modern power systems, where numerous scenarios (e.g., operating conditions and "N-k" contingencies) must be assessed at the same time. The purpose of this work is to construct a data-driven method that terminates time-domain simulation (TDS) early and dynamically schedules the TSBA task queue a priori, in order to reduce computational burden without compromising accuracy. To achieve this goal, a time-adaptive cascaded convolutional neural network (CNN) model is developed to predict stability and terminate TDS early. Additionally, an information-entropy-based prioritization strategy is designed to identify informative samples, dynamically schedule the TSBA task queue, and update the model in a timely manner, thus further reducing simulation time. A case study on the IEEE 39-bus system validates the effectiveness of the proposed method.
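One way to read the information-entropy-based prioritization in the abstract above is to rank pending scenarios by the entropy of the model's predicted stability probability. The sketch below does exactly that, under the assumption that higher entropy marks a more informative sample that should be simulated first. The abstract does not state the ordering rule, so this direction, along with the names `binary_entropy` and `prioritize`, is an assumption for illustration.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a two-class prediction with P(stable) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def prioritize(predicted_stable):
    """Order scenario names from most to least uncertain prediction.

    `predicted_stable` maps a scenario name to the predicted probability
    that the scenario is stable. Ties are broken by name.
    """
    return sorted(predicted_stable,
                  key=lambda name: (-binary_entropy(predicted_stable[name]), name))
```

A scheduler built on this ranking would re-run `prioritize` whenever the predictor is updated, so that the queue always reflects the model's current uncertainty.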