Funding: supported by the National Natural Science Foundation of China (61305133, 61573285) and the Fundamental Research Funds for the Central Universities (3102016CG002).
Abstract: When training data are insufficient, especially when only a small sample is available, domain knowledge can be incorporated into parameter learning to improve the performance of Bayesian networks. In this paper, a new monotonic constraint model is proposed to represent a common type of domain knowledge, and a monotonic constraint estimation algorithm is then proposed to learn the parameters under this model. To demonstrate the advantages of the proposed algorithm, a series of experiments is carried out. The results show that the proposed algorithm obtains more accurate parameters than several existing algorithms without incurring the highest computational complexity.
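A monotonic constraint in this setting typically requires a conditional probability such as P(Y = 1 | X = x) to be non-decreasing in an ordered parent state x. The following is a minimal sketch of that idea only, not the authors' monotonic constraint estimation algorithm: frequency estimates are projected onto a non-decreasing sequence with the pool-adjacent-violators procedure, using hypothetical counts.

```python
import numpy as np

def monotonic_cpt(counts_y1, counts_total):
    """MLE of P(Y=1|X=x) projected onto a non-decreasing sequence (PAV)."""
    raw = counts_y1 / np.maximum(counts_total, 1)     # maximum-likelihood frequencies
    w = counts_total.astype(float)
    blocks = [[raw[i], w[i], 1] for i in range(len(raw))]   # [value, weight, size]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0]:           # monotonicity violated: pool the two blocks
            v0, w0, n0 = blocks[i]
            v1, w1, n1 = blocks[i + 1]
            blocks[i:i + 2] = [[(v0 * w0 + v1 * w1) / (w0 + w1), w0 + w1, n0 + n1]]
            i = max(i - 1, 0)
        else:
            i += 1
    # expand pooled block values back to one estimate per parent state
    return np.concatenate([[v] * n for v, _, n in blocks])

# Hypothetical counts for a parent X with 4 ordered states.
print(monotonic_cpt(np.array([2, 1, 6, 9]), np.array([10, 10, 10, 10])))
```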
Funding: supported by the National Aviation Science Foundation of China (20090196002).
Abstract: An ant colony optimization (ACO)-simulated annealing (SA) based algorithm is developed for the target assignment problem (TAP) in the air defense (AD) command and control (C2) system of a surface-to-air missile (SAM) tactical unit. The process of accomplishing the target assignment (TA) task is analyzed. A firing advantage degree (FAD) for a fire unit (FU) intercepting a target is introduced, and its evaluation model is established using a linear weighted synthesis method. A TA optimization model is presented, and solution algorithms based on ACO and SA are designed for it. A hybrid optimization strategy combining the merits of ACO and SA is then developed. Simulation examples show that the model and algorithms meet the solution requirements of the TAP in AD combat.
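As one concrete illustration of the linear weighted synthesis and the SA component (the paper's exact FAD factors, weights, and hybrid ACO-SA strategy are not reproduced here), the sketch below builds an advantage matrix as a weighted sum of hypothetical factor scores and runs a small simulated-annealing swap search over one-to-one fire-unit-to-target assignments.

```python
import math, random

random.seed(0)
n = 4                                    # fire units == targets, for simplicity
weights = [0.4, 0.3, 0.3]                # hypothetical weights of three advantage factors
factors = [[[random.random() for _ in range(3)] for _ in range(n)] for _ in range(n)]
# FAD matrix: linear weighted synthesis of the factor scores
fad = [[sum(w * f for w, f in zip(weights, factors[i][j])) for j in range(n)] for i in range(n)]

def total(assign):                       # total advantage of a one-to-one assignment
    return sum(fad[i][assign[i]] for i in range(n))

assign = list(range(n))
best, T = list(assign), 1.0
while T > 1e-3:
    i, j = random.sample(range(n), 2)
    cand = list(assign)
    cand[i], cand[j] = cand[j], cand[i]  # neighbour move: swap two targets
    delta = total(cand) - total(assign)
    if delta > 0 or random.random() < math.exp(delta / T):   # Metropolis acceptance
        assign = cand
        if total(assign) > total(best):
            best = list(assign)
    T *= 0.99                            # geometric cooling schedule
print(best, round(total(best), 3))
```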
Funding: supported by the National Natural Science Foundation of China (60774064) and the Aerospace Science Foundation (20085153015).
Abstract: This article introduces a composition algorithm for a fleet of intermediate carriers that must deliver a swarm of miniature unmanned aerial vehicles (mini-UAVs) to a mission area. The algorithm is based on the sequential solution of several knapsack problems with various constraints. It both forms an initial set of required carrier types and generates the fleet of intermediate carriers itself. The formation of a carrier fleet for a suppression of enemy air defense (SEAD) problem is presented to illustrate the proposed algorithm.
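The core building block, a 0/1 knapsack solved by dynamic programming, can be sketched as follows; the payload capacity, mini-UAV sizes, and mission values are hypothetical, and the paper's sequential multi-constraint procedure is not reproduced.

```python
def knapsack(capacity, sizes, values):
    """Return the maximum total value packable within capacity (0/1 knapsack DP)."""
    best = [0] * (capacity + 1)
    for s, v in zip(sizes, values):
        for c in range(capacity, s - 1, -1):   # iterate backwards so each item is used at most once
            best[c] = max(best[c], best[c - s] + v)
    return best[capacity]

# Hypothetical mini-UAV payload sizes (kg) and mission values for one carrier.
print(knapsack(capacity=50, sizes=[12, 7, 20, 9, 15], values=[8, 3, 14, 5, 10]))
```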
Funding: supported by the National Natural Science Foundation of China (10377014) and the Innovation Foundation of Northwestern Polytechnical University (2007KJ01027).
Abstract: The coordinated Bayesian optimization algorithm (CBOA) is proposed according to the functional independence, conformity, and complementarity between the electronic countermeasure (ECM) and firepower attack systems. The selection criteria combine individual fitness probabilities with a coordination degree, so that superior individuals can be selected to construct the Bayesian network that drives population evolution by producing new chromosomes. The CBOA thus not only guarantees an effective coordinated decision-making mechanism between the populations, but also maintains population diversity and enhances algorithm performance. Simulation results confirm the validity of the algorithm.
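A toy sketch of the selection idea follows; the combination of fitness probability and coordination degree shown here is an assumed form, not the paper's exact criterion, and all values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
fitness = rng.random(8)                  # hypothetical individual fitness values
coordination = rng.random(8)             # hypothetical coordination degrees in [0, 1]

p_fit = fitness / fitness.sum()          # selection probability derived from fitness
score = p_fit * coordination             # assumed combined selection score
selected = np.argsort(score)[-4:]        # keep the best half to build the Bayesian network
print(selected, score[selected].round(3))
```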
Funding: co-supported by the National Natural Science Foundation of China (Nos. 62003267 and 61573285), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2020JQ-220), the Open Project of Science and Technology on Electronic Information Control Laboratory, China (No. JS20201100339), and the Open Project of Science and Technology on Electromagnetic Space Operations and Applications Laboratory, China (No. JS20210586512).
Abstract: As advanced combat weapons, Unmanned Aerial Vehicles (UAVs) have been widely used in military operations. In this paper, we formulate the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and propose a novel Deep Reinforcement Learning (DRL) method that allows UAVs to perform dynamic target tracking tasks in large-scale unknown environments. To alleviate the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by plausibly imagining the target trajectory in a failed episode, thereby augmenting the experiences. The well-designed goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences are stored, while the sampling filtering strategy of IFHER ensures that these stored augmented experiences are fully learned according to their high priorities. When trained in a complex environment constructed from the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared to the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Testing experiments carried out in environments of different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
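The hindsight relabelling at the core of such methods can be sketched as follows. This is a generic HER-style example with an assumed episode format, a simple distance-based reward, and a crude quality filter; IFHER's trajectory imagination and its goal, episode, quality, and sampling filters are not reproduced.

```python
import numpy as np

def relabel_episode(episode, quality_threshold=0.5):
    """Turn a failed tracking episode into a successful one by treating the
    last achieved position as the goal, then keep it only if progress was made."""
    achieved_goal = episode[-1]["achieved"]           # final position reached by the UAV
    augmented = []
    for step in episode:
        new_step = dict(step)
        new_step["goal"] = achieved_goal
        new_step["reward"] = -np.linalg.norm(step["achieved"] - achieved_goal)
        augmented.append(new_step)
    # quality filter: discard episodes that barely move toward the relabelled goal
    progress = np.linalg.norm(episode[0]["achieved"] - achieved_goal)
    return augmented if progress > quality_threshold else None

# Hypothetical two-step failed episode in 2-D.
episode = [{"achieved": np.array([0.0, 0.0])}, {"achieved": np.array([1.0, 1.0])}]
print(relabel_episode(episode) is not None)
```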
Funding: co-supported by the National Natural Science Foundation of China (Nos. 62003267 and 61573285), the Aeronautical Science Foundation of China (ASFC) (No. 20175553027), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2020JQ-220).
Abstract: Unmanned Aerial Vehicles (UAVs) play a vital role in military warfare. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have used Deep Reinforcement Learning (DRL) methods to address the AMP problem and have achieved good results. From the perspective of sampling, this paper designs a sampling method with double screening, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. The REL-DDPG algorithm uses a Prioritized Experience Replay (PER) mechanism to break the correlation of consecutive experiences in the experience pool, finds the experiences most similar to the current state to learn from, in line with theories from human education, and expands the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. The training experiments show that REL-DDPG improves the convergence speed and the convergence result compared to the state-of-the-art DDPG algorithm, while the testing experiments demonstrate the applicability of the algorithm and investigate its performance under different parameter settings.
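The relevance-biased sampling idea can be sketched as follows, using an assumed Euclidean similarity between stored states and the current state; this is not the paper's exact prioritization scheme, and the buffer contents are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
buffer_states = rng.random((1000, 6))            # hypothetical states stored in the replay buffer
current_state = rng.random(6)                    # hypothetical current UAV state

dist = np.linalg.norm(buffer_states - current_state, axis=1)
similarity = 1.0 / (1.0 + dist)                  # closer transitions get higher sampling weight
probs = similarity / similarity.sum()
batch_idx = rng.choice(len(buffer_states), size=64, replace=False, p=probs)
print(batch_idx[:8])
```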
Funding: supported by the National Natural Science Foundation of China (No. 61573285).
Abstract: In this paper, a model-based adaptive mobility control method for an Unmanned Aerial Vehicle (UAV) acting as a communication relay is presented, intended to improve the network performance of airborne multi-user systems. The mobility control problem is addressed by jointly considering unknown Radio Frequency (RF) channel parameters, unknown multi-user mobility, and unavailable Angle of Arrival (AoA) information of the received signal. A Kalman filter and a least-squares-based estimation algorithm are used to predict the future user positions and to estimate the RF channel parameters between the users and the UAV, respectively. Two relay application cases are considered: end-to-end and multi-user communications. A line search algorithm, whose stability is proven, is proposed for the former, whereas a simplified gradient-based algorithm is proposed for the latter to provide a target relay position at each decision time step, reducing the two-dimensional search to a one-dimensional one. Simulation results show that the proposed mobility control algorithms can drive the UAV to reach or track the optimal relay position and improve network performance. The proposed method also reflects the properties of different metrics used as objective network performance functions.
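A minimal constant-velocity Kalman filter for predicting a user's next position from noisy position fixes might look as follows; the state model, noise levels, and measurements are assumed for illustration, and the paper's filter design and least-squares channel estimation step are not reproduced.

```python
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)  # constant-velocity transition
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)                                # only position is observed
Q, R = 0.01 * np.eye(4), 0.5 * np.eye(2)                                         # assumed process/measurement noise

x, P = np.zeros(4), np.eye(4)                                                    # state [px, py, vx, vy] and covariance
for z in [np.array([1.0, 0.5]), np.array([2.1, 1.1]), np.array([2.9, 1.4])]:     # hypothetical position fixes
    x, P = F @ x, F @ P @ F.T + Q                    # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                   # Kalman gain
    x = x + K @ (z - H @ x)                          # update with the new measurement
    P = (np.eye(4) - K @ H) @ P
print("predicted next position:", (F @ x)[:2].round(2))
```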