Abstract: This paper investigates a formation control problem of fixed-wing unmanned aerial vehicle (UAV) swarms. A group-based hierarchical architecture is established among the UAVs, which decomposes all the UAVs into several distinct and non-overlapping groups. In each group, the UAVs form hierarchies with one UAV selected as the group leader. All group leaders execute coordinated path following to cooperatively handle the mission process among different groups, and the remaining followers track their direct leaders to achieve the inner-group coordination. More specifically, for a group leader, a virtual target moving along its desired path is assigned to the UAV, and an updating law is proposed to coordinate all the group leaders' virtual targets; for a follower UAV, a distributed leader-following formation control law is proposed to make the follower's heading angle coincide with that of its direct leader, while keeping the desired relative position with respect to its direct leader. The proposed control law guarantees the global asymptotic stability of the whole closed-loop swarm system under the control input constraints of fixed-wing UAVs. Theoretical proofs and numerical simulations are provided, which corroborate the effectiveness of the proposed method.
Funding: This work was supported by the National Natural Science Foundation of China (62071440, 61671241).
Abstract: In this paper, we propose a joint waveform selection and power allocation (JWSPA) strategy based on chance-constraint programming (CCP) for a manned/unmanned aerial vehicle hybrid swarm (M/UAVHS) tracking a single target. The low probability of intercept (LPI) performance of the system can thereby be improved by collaboratively optimizing transmit power and waveform. For target radar cross section (RCS) prediction, we design a random RCS prediction model based on electromagnetic simulation (ES) of the target. For waveform selection, we build a waveform library to adaptively manage the frequency modulation slope and pulse width of the radar waveform. For power allocation, the CCP is employed to balance tracking accuracy and power resources. The Bayesian Cramér-Rao lower bound (BCRLB) is adopted as the criterion for measuring target tracking accuracy. A hybrid intelligent algorithm, in which stochastic simulation is integrated into the genetic algorithm (GA), is used to solve the stochastic optimization problem. Simulation results demonstrate that the proposed JWSPA strategy saves more transmit power than the traditional fixed-waveform scheme at the same target tracking accuracy.
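The chance-constrained power allocation idea can be illustrated with a small sketch. It is not the paper's model: the error proxy `c / (power * rcs)`, the exponentially distributed RCS, and every name and number below are assumptions chosen for illustration, and a bisection search stands in for the genetic algorithm. Stochastic simulation estimates the probability that the accuracy requirement holds, and the search returns the smallest power for which that probability reaches the required confidence level.

```python
import random


def satisfaction_probability(power, rcs_samples, c=1.0, max_error=0.5):
    """Monte Carlo estimate of P(error <= max_error) for a given power.

    error = c / (power * rcs) is a hypothetical stand-in for the BCRLB.
    The test is written without division so that rcs == 0 is handled.
    """
    hits = sum(1 for rcs in rcs_samples if power * rcs * max_error >= c)
    return hits / len(rcs_samples)


def min_power_for_chance_constraint(rcs_samples, confidence=0.9,
                                    p_low=1e-3, p_high=1e3, iters=60):
    """Smallest power with P(error <= max_error) >= confidence (bisection).

    Valid because the estimated probability is non-decreasing in power.
    """
    for _ in range(iters):
        mid = 0.5 * (p_low + p_high)
        if satisfaction_probability(mid, rcs_samples) >= confidence:
            p_high = mid   # constraint met: try less power
        else:
            p_low = mid    # constraint violated: more power is needed
    return p_high


rng = random.Random(0)
# Assumed RCS model: exponential with unit mean (illustrative only).
samples = [rng.expovariate(1.0) for _ in range(5000)]
power = min_power_for_chance_constraint(samples)
```

Because the constraint is probabilistic, the returned power tolerates a failure of the accuracy requirement in at most 10% of the simulated RCS draws, which is what lets a chance-constrained design spend less power than a worst-case one.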
Funding: Supported by the National Natural Science Foundation of China (61502534), the Shaanxi Provincial Natural Science Foundation (2020JQ-493), the Integrative Equipment Research Project of Armed Police Force (WJ20211A030018), the Military Science Project of the National Social Science Fund (WJ2019-SKJJ-C-092), and the Theoretical Research Foundation of Armed Police Engineering University (WJY202148).
Abstract: Cooperative search-attack is an important military application of unmanned aerial vehicle (UAV) swarms. The coupling between path planning and task allocation, the heterogeneity of UAVs, and the dynamic nature of the task environment greatly increase the complexity and difficulty of the UAV swarm cooperative search-attack mission planning problem. Inspired by the collaborative hunting behavior of wolf packs, a distributed self-organizing method for UAV swarm search-attack mission planning is proposed. First, to solve the multi-target search problem in unknown environments, a cooperative search algorithm for UAV swarms inspired by wolf scouting behavior is designed. Second, a distributed self-organizing task allocation algorithm for cooperative attacks on targets by the UAV swarm is proposed by analyzing the flexible division of labor among wolves. By abstracting each UAV as a simple artificial wolf agent, flexible motion planning and group task coordination for the UAV swarm can be realized through self-organization. The effectiveness of the proposed method is verified by a set of simulation experiments, its stability and scalability are evaluated, and the results indicate that it provides an integrated solution to the coupled path planning and task allocation problems of the UAV swarm cooperative search-attack task.
Funding: This work was supported by the National Natural Science Foundation of China (61502522), the Equipment Pre-Research Field Fund (JZX7Y20190253036101), the Equipment Pre-Research Ministry of Education Joint Fund (6141A02033703), and the Hubei Provincial Natural Science Foundation (2019CFC897).
Abstract: To solve the problem of time difference of arrival (TDOA) positioning and tracking of targets by an unmanned aerial vehicle (UAV) swarm in future air combat, this paper adopts the TDOA positioning method and uses the time difference sensors of the UAV swarm to locate target radiation sources. First, a TDOA model for the target is set up for the UAV swarm under the condition that the error variance varies with the received signal-to-noise ratio. The positioning accuracy is analyzed using the geometric dilution of precision (GDOP). The D-optimality criterion of the positioning model is derived theoretically. The target position is computed, and the maximum of the determinant of the Fisher information matrix is used as the optimization objective to optimize the track of the UAVs in real time. Simulation results show that the track optimization improves the accuracy and stability with which the UAV swarm positions the target.
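The D-optimality objective mentioned above can be made concrete with a short sketch. It is a simplified 2-D illustration under stated assumptions, not the paper's model: range-difference errors are taken as independent with a constant standard deviation `sigma` (the paper lets the variance depend on the signal-to-noise ratio), and the function name and sensor layouts are hypothetical.

```python
import math


def tdoa_fim_determinant(sensors, target, sigma=1.0):
    """Determinant of the 2-D Fisher information matrix for TDOA ranging.

    sensors[0] is the reference sensor. Range-difference errors are assumed
    independent with standard deviation sigma (a simplification).
    """
    def unit_vector(sensor):
        dx, dy = target[0] - sensor[0], target[1] - sensor[1]
        dist = math.hypot(dx, dy)
        return dx / dist, dy / dist

    ref = unit_vector(sensors[0])
    a = b = d = 0.0  # FIM = [[a, b], [b, d]]
    for sensor in sensors[1:]:
        ux, uy = unit_vector(sensor)
        gx, gy = ux - ref[0], uy - ref[1]  # Jacobian row of one TDOA
        a += gx * gx / sigma**2
        b += gx * gy / sigma**2
        d += gy * gy / sigma**2
    return a * d - b * b


target = (0.0, 0.0)
spread = [(10.0, 0.0), (-5.0, 8.66), (-5.0, -8.66)]   # sensors surround target
clustered = [(10.0, 0.0), (10.0, 1.0), (10.0, -1.0)]  # sensors bunched together
det_spread = tdoa_fim_determinant(spread, target)
det_clustered = tdoa_fim_determinant(clustered, target)
```

A larger determinant corresponds to a smaller uncertainty ellipse, so a track optimizer built on this criterion would favor moves toward the surrounding layout over the bunched one.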
Funding: Supported by the National Natural Science Foundation of China (Nos. 61973158, 61673209) and the Aeronautical Science Foundation (No. 2016ZA52009).
Abstract: An ant colony optimization with artificial potential field (ACOAPF) algorithm is proposed to solve the cooperative search mission planning problem of an unmanned aerial vehicle (UAV) swarm. The algorithm adopts a distributed architecture in which each UAV is regarded as an ant and makes decisions autonomously. At each decision step, each ant chooses the next grid cell according to the state transition rule and updates its own artificial potential field and pheromone map based on the current search results. Through iterations of this process, the cooperative search of the mission area by the UAV swarm is realized. The state transition rule is divided into two types. If the artificial potential force is larger than a threshold, the deterministic transition rule is adopted; otherwise, a heuristic transition rule is used. The deterministic transition rule ensures that UAVs avoid threats or approach targets quickly. The heuristic transition rule, which considers the pheromone and heuristic information, ensures continuous search of the area with the goal of covering more unknown area and finding more targets. Finally, simulations are carried out to verify the effectiveness of the proposed ACOAPF algorithm for the cooperative search mission of a UAV swarm.
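The two-branch state transition rule can be sketched as follows. The abstract does not give the exact formulas, so this is an assumption-laden illustration: the per-cell force values, the standard ant-colony weighting `pheromone**alpha * heuristic**beta`, and the parameters `threshold`, `alpha`, and `beta` are all hypothetical.

```python
import random


def choose_next_grid(candidates, force, pheromone, heuristic,
                     threshold=1.0, alpha=1.0, beta=2.0, rng=random):
    """Pick the next grid cell for one UAV ("ant").

    candidates: list of cell ids; force, pheromone, heuristic: dicts keyed by
    cell id. threshold, alpha and beta are illustrative parameters.
    """
    strongest = max(candidates, key=lambda c: force[c])
    if force[strongest] > threshold:
        # Deterministic rule: follow the dominant potential force.
        return strongest
    # Heuristic rule: roulette-wheel selection over pheromone and heuristic.
    weights = [pheromone[c] ** alpha * heuristic[c] ** beta for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


cells = ["north", "east"]
# Strong force toward "north": the deterministic branch is taken.
forced = choose_next_grid(cells, {"north": 2.0, "east": 0.1},
                          {"north": 1.0, "east": 1.0},
                          {"north": 1.0, "east": 1.0})
# Weak forces: the heuristic branch samples, here with all weight on "east".
sampled = choose_next_grid(cells, {"north": 0.1, "east": 0.2},
                           {"north": 1.0, "east": 1.0},
                           {"north": 0.0, "east": 1.0})
```

Passing a seeded `random.Random` instance as `rng` makes the heuristic branch reproducible in simulation runs.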
Abstract: A decentralized task planning algorithm is proposed for heterogeneous unmanned aerial vehicle (UAV) swarms with different capabilities. The algorithm extends the consensus-based bundle algorithm (CBBA) to account for a more realistic and complex environment. The extension handles multi-agent tasks that require multiple UAVs to complete them collaboratively, and considers obstacle avoidance in task scenarios. We propose a new consensus algorithm to solve the multi-agent task allocation problem and use the Dubins algorithm to design feasible paths that let UAVs avoid obstacles while respecting motion constraints. Experimental results show that the extended CBBA converges to a conflict-free and feasible solution for multi-agent task planning problems.
Abstract: Projects on unmanned aerial vehicle (UAV) swarms have been initiated on a large scale in the last few years, especially from 2015 to 2016. As a result, the number of related works on UAV swarms has been on the rise, with the rate of growth accelerating dramatically since 2017. This research conducts a bibliometric analysis of robotics swarms and UAV swarms to address the following questions: (i) the disciplines mentioned in UAV swarm research; (ii) the future development trends and hotspots in UAV swarm research; (iii) tracking of related outcomes in UAV swarm research.
Funding: Supported by the National Natural Science Foundation of China (61933010) and the Natural Science Basic Research Plan in Shaanxi Province of China (2023-JC-QN-0733).
Abstract: This paper tackles the formation-containment control problem of a fixed-wing unmanned aerial vehicle (UAV) swarm with model uncertainties for dynamic target tracking in three-dimensional space in the presence of UAV actuator and sensor faults. The fixed-wing UAV swarm under consideration is organized in a “multi-leader-multi-follower” structure, in which only several leaders can obtain the dynamic target information while the others receive only their neighbors' information through the communication network. To simultaneously realize formation, containment, and dynamic target tracking, a two-layer control framework is adopted to decouple the problem into two subproblems: reference trajectory generation and trajectory tracking. In the upper layer, a distributed finite-time estimator (DFTE) is proposed to generate each UAV's reference trajectory in accordance with the control objective. Subsequently, a distributed composite robust fault-tolerant trajectory tracking controller is developed in the lower layer, where a novel adaptive extended super-twisting (AESTW) algorithm with a finite-time extended state observer (FTESO) is used to solve the robust trajectory tracking control problem under model uncertainties and actuator and sensor faults. The proposed controller simultaneously guarantees rapid response and enhances the system's robustness with less chattering. Finally, simulations are carried out to demonstrate the effectiveness and competitiveness of the proposed two-layer fault-tolerant cooperative control scheme.
Funding: Supported by the National Natural Science Foundation of China (62031017, 61971221).
Abstract: It is essential to maximize capacity while satisfying the transmission time delay requirement of an unmanned aerial vehicle (UAV) swarm communication system. To address this challenge, a dynamic decentralized optimization mechanism is presented for joint spectrum and power (JSAP) resource allocation based on deep Q-learning networks (DQNs). Each UAV-to-UAV (U2U) link is regarded as an agent that is capable of identifying the optimal spectrum and power for communication. A convolutional neural network, a target network, and experience replay are adopted during training. The simulation findings indicate that the proposed method has the potential to improve both communication capacity and the probability of successful data transmission when compared with random centralized assignment and multichannel access methods.
Funding: Supported by the National Natural Science Foundation of China (61502522), the Equipment Pre-Research Field Fund (JZX7Y20190253036101), the Equipment Pre-Research Ministry of Education Joint Fund (6141A02033703), and the Hubei Provincial Natural Science Foundation (2019CFC897).
Abstract: Source location based on hybrid time difference of arrival (TDOA)/frequency difference of arrival (FDOA) measurements is a basic problem in wireless sensor networks, and the layout of sensors in hybrid TDOA/FDOA positioning greatly affects the positioning accuracy. With unmanned aerial vehicles (UAVs) used as base stations, the trajectory of the UAV swarm is optimized to form an optimal positioning configuration that improves the accuracy of target position and velocity estimation. In this paper, a hybrid TDOA/FDOA positioning model is first established, and the positioning accuracy of hybrid TDOA/FDOA under different positioning configurations and different measurement errors is simulated using the geometric dilution of precision (GDOP) factor. Second, the Cramér-Rao lower bound (CRLB) matrix of hybrid TDOA/FDOA location under different moving states of the target is derived theoretically, the objective function of the track optimization is obtained, and the track of the UAV swarm is optimized in real time. The simulation results show that the track optimization effectively improves the accuracy of target position and velocity estimation.
Funding: Supported by the Aeronautical Science Foundation of China (2020Z023053001).
Abstract: This paper studies a special defense game in which an unmanned aerial vehicle (UAV) swarm defends against a fast intruder. The fast intruder applies an offensive strategy based on the artificial potential field method and the Apollonius circle to scout a certain destination. As defenders, the UAVs are arranged into three layers: the forward layer, the midfield layer, and the back layer. A co-defense mechanism, including a role derivation method for the UAV swarm and a guidance law based on the co-defense front point, is introduced so that the UAV swarm can co-detect the intruder. In addition, five formations are designed for comparative analysis when ten UAVs are deployed. The effectiveness of the proposed co-defense method is verified through Monte Carlo experiments and an ablation experiment.
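For readers unfamiliar with the Apollonius circle, the sketch below computes it for one defender and one intruder. It only illustrates the geometric construction and makes no claim about how the paper embeds it in the intruder's strategy; the function name, argument names, and example positions are hypothetical, and constant speeds with a speed ratio different from 1 are assumed.

```python
import math


def apollonius_circle(defender, intruder, speed_ratio):
    """Circle of points X with |X - defender| / |X - intruder| = speed_ratio.

    speed_ratio = defender speed / intruder speed, assumed != 1.
    Returns (center_x, center_y, radius).
    """
    k2 = speed_ratio ** 2
    cx = (defender[0] - k2 * intruder[0]) / (1.0 - k2)
    cy = (defender[1] - k2 * intruder[1]) / (1.0 - k2)
    gap = math.hypot(defender[0] - intruder[0], defender[1] - intruder[1])
    radius = speed_ratio * gap / abs(1.0 - k2)
    return cx, cy, radius


# A defender at the origin moving at half the speed of an intruder at (3, 0).
cx, cy, r = apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5)
```

Points on the circle are reached by both vehicles at the same time. With a slower defender the circle encloses the defender, and it marks the region the defender can reach before the intruder does.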
Funding: Supported by the Aeronautical Science Foundation (2017ZC53033).
Abstract: Unmanned aerial vehicle (UAV) swarm technology has been a research hotspot in recent years. With the continuous improvement of the autonomous intelligence of UAVs, swarm technology will become one of the main trends of future UAV development. This paper studies the behavior decision-making process of the UAV swarm rendezvous task based on the double deep Q network (DDQN) algorithm. We design a guided reward function to effectively solve the algorithm convergence problem caused by sparse returns in deep reinforcement learning (DRL) for long-period tasks. We also propose the concept of a temporary storage area, which optimizes the memory playback unit of the traditional DDQN algorithm, improves the convergence speed of the algorithm, and speeds up the training process. Unlike traditional task environments, this paper establishes a continuous state-space task environment model to improve the authentication process of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm are trained in different task scenarios. The experimental results validate that the DDQN algorithm is efficient in training the UAV swarm to complete the given collaborative tasks while meeting the requirements of the UAV swarm for centralization and autonomy, and in improving the intelligence of UAV swarm collaborative task execution. The simulation results show that, after training, the proposed UAV swarm can carry out the rendezvous task well, and the mission success rate reaches 90%.
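The core of the double DQN update can be shown in a few lines. This sketch covers only the standard DDQN target computation; the paper's guided reward function and temporary storage area are not reproduced, and the function name and example values are illustrative.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online network selects the next action and the
    target network evaluates it.

    next_q_online / next_q_target: lists of Q-values for the next state.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# The online network prefers action 1; the target network values it at 0.3.
y = double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8],
                      done=False, gamma=0.9)
y_terminal = double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8],
                               done=True)
```

Decoupling action selection from action evaluation is what distinguishes DDQN from plain DQN, which would use the target network's own maximum (0.8 here) and tends to overestimate values.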
Funding: Supported by the Key Research and Development Program of Shaanxi (2022GY-089) and the Natural Science Basic Research Program of Shaanxi (2022JQ-593).
Abstract: The deep deterministic policy gradient (DDPG) algorithm is an off-policy method that combines two mainstream reinforcement learning methods based on value iteration and policy iteration. Using the DDPG algorithm, agents can explore and summarize the environment to make autonomous decisions in continuous state and action spaces. In this paper, a cooperative defense method using DDPG with swarms of unmanned aerial vehicles (UAVs) is developed and validated, and it shows promising practical value for defense. We address the sparse-reward problem of reinforcement learning in a long-term task by building a reward function for UAV swarms and by optimizing the learning process of the artificial neural network based on the DDPG algorithm to reduce oscillation during learning. The experimental results show that the DDPG algorithm can guide the UAV swarm to perform the defense task efficiently, meeting the requirements of a UAV swarm for non-centralization and autonomy, and promoting the intelligent development of UAV swarms and their decision-making process.
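The abstract does not describe how the learning process was stabilized. For orientation only, the sketch below shows the soft (Polyak) target-network update used in standard DDPG, which is one common source of training stability; it is not claimed to be the paper's modification. Parameters are represented as flat lists of floats, an assumption made to keep the example dependency-free.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Blend online parameters into the target parameters.

    Implements target <- tau * online + (1 - tau) * target elementwise,
    as in standard DDPG. Returns a new list; inputs are not modified.
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_target, w_online in zip(target_params, online_params)
    ]

# With tau = 0.1 the target moves 10% of the way toward the online weights.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

A small `tau` makes the target network change slowly, so the critic's regression targets drift gradually instead of jumping at every step.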
Funding: This work was partially supported by the Military Science Project of National Social Science Foundation (No. 2019-SKJJ-C-092), the National Natural Science Foundation of China (No. 61502534), the Natural Science Foundation of Shanxi Province (No. 2020JQ-493), the Military Equipment Research Project (No. WJ2020A020029), the Military Theory Project of PAP (No. WJJY21JL0618), and the Research Foundation of Armed Police Force Engineering University (Nos. WJY202148 and JLY2020084).
Abstract: Dynamic task allocation of unmanned aerial vehicle (UAV) swarms for ground targets is an important part of UAV swarm task planning and a key technology for improving autonomy. Realizing dynamic task allocation in UAV swarms for ground targets is very difficult because of the large uncertainty in the swarm, target, and environment states and the strict real-time allocation requirements. Hence, dynamic task allocation of UAV swarms oriented to ground targets has become a key and difficult problem in the field of mission planning. In this work, dynamic task allocation methods for UAV swarms oriented to ground targets are comprehensively and systematically summarized from two aspects: the establishment of an allocation model and the solution of the allocation model. First, the basic concept and trigger scenarios are introduced. Second, the research status, advantages, and disadvantages of the two allocation models are analyzed. Third, the research status, advantages, and disadvantages of several common dynamic task allocation algorithms, such as algorithms based on market mechanisms, intelligent optimization algorithms, and clustering algorithms, are evaluated. Finally, the specific problems of current UAV swarm dynamic task allocation methods for ground targets are highlighted, and future research directions are identified. This work offers an important reference for fully understanding the current state of UAV swarm dynamic task allocation technology.
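To make the "market mechanism" family named above concrete, here is a minimal sketch of a sequential single-item auction: tasks are announced one at a time and each goes to the lowest-cost UAV that still has capacity. It is a generic illustration, not an algorithm taken from the survey; the cost matrix, capacity model, and function name are assumptions.

```python
def sequential_auction(costs, capacity=1):
    """Assign each task to the cheapest UAV with remaining capacity.

    costs[u][t]: bid (cost) of UAV u for task t (assumed given).
    capacity:    maximum number of tasks per UAV.
    Returns a dict mapping task index -> UAV index; tasks that no UAV
    can take are left out of the result.
    """
    num_uavs = len(costs)
    num_tasks = len(costs[0]) if costs else 0
    load = [0] * num_uavs
    assignment = {}
    for task in range(num_tasks):
        bidders = [u for u in range(num_uavs) if load[u] < capacity]
        if not bidders:
            break  # every UAV is fully loaded
        winner = min(bidders, key=lambda u: costs[u][task])
        assignment[task] = winner
        load[winner] += 1
    return assignment

# UAV 0 is cheapest for both tasks, but capacity 1 forces task 1 onto UAV 1.
result = sequential_auction([[1.0, 2.0], [3.0, 4.0]], capacity=1)
```

Auctions of this kind are cheap to run and easy to distribute, but the greedy task order means the result is generally not cost-optimal.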
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61973309, 61801494 and 61702528) and in part by the Hunan Provincial Innovation Foundation for Postgraduate, China (No. CX2017B014).
Abstract: This paper investigates a formation control problem of fixed-wing Unmanned Aerial Vehicle (UAV) swarms. A group-based hierarchical architecture is established among the UAVs, which decomposes all the UAVs into several distinct and non-overlapping groups. In each group, the UAVs form hierarchies with one UAV selected as the group leader. All group leaders execute coordinated path following to cooperatively handle the mission process among different groups, and the remaining followers track their direct leaders to achieve inner-group coordination. More specifically, for a group leader, a virtual target moving along its desired path is assigned to the UAV, and an updating law is proposed to coordinate all the group leaders’ virtual targets; for a follower UAV, a distributed leader-following formation control law is proposed to make the follower’s heading angle coincide with that of its direct leader while keeping the desired relative position with respect to its direct leader. The proposed control law guarantees the global asymptotic stability of the whole closed-loop swarm system under the control input constraints of fixed-wing UAVs. Theoretical proofs and numerical simulations are provided, which corroborate the effectiveness of the proposed method.
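The follower's objective described above (match the leader's heading and hold a fixed relative position) can be illustrated by computing the two reference quantities a follower would track. This is only a planar geometric sketch under assumed conventions (heading measured counterclockwise from the x-axis, offset given in the leader's body frame); it does not reproduce the paper's control law or its input constraints.

```python
import math

def desired_follower_position(leader_xy, leader_heading, offset_body):
    """Rotate a body-frame offset into the world frame and add it to the leader.

    offset_body = (forward, left) relative to the leader, in meters.
    """
    c, s = math.cos(leader_heading), math.sin(leader_heading)
    dx, dy = offset_body
    return (leader_xy[0] + c * dx - s * dy,
            leader_xy[1] + s * dx + c * dy)

def heading_error(follower_heading, leader_heading):
    """Signed heading difference wrapped to (-pi, pi]."""
    err = leader_heading - follower_heading
    return math.atan2(math.sin(err), math.cos(err))

# Leader at (10, 5) flying along +y; follower slot is 2 m behind, 1 m to the left.
slot = desired_follower_position((10.0, 5.0), math.pi / 2, (-2.0, 1.0))
err = heading_error(math.pi - 0.1, -math.pi + 0.1)
```

A formation controller would drive the follower's position toward `slot` and the wrapped heading error toward zero; wrapping matters because headings near plus or minus pi are physically close.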