Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in interactive environments, where finding the optimal maneuvering decision-making policy has become one of the key issues in enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks for the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). We then present a reward shaping method for the two guidance tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy output during the later stage of training. The effectiveness of the proposed policy is illustrated by the curves of training parameters and by extensive experimental results from testing the trained policy.
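The potential-based reward shaping mentioned in the abstract above can be sketched as follows. This is only an illustration of the general technique, not the authors' implementation: the potential function (negative distance to a target point), the discount factor, and every name in the block are assumptions.

```python
import math

GAMMA = 0.99  # assumed discount factor, not taken from the paper


def potential(position, target):
    """Hypothetical potential: negative Euclidean distance to the target point."""
    return -math.dist(position, target)


def shaped_reward(env_reward, position, next_position, target, gamma=GAMMA):
    """Add the potential-based shaping term F = gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_position, target) - potential(position, target)
    return env_reward + shaping
```

With this form, a step that moves the vehicle closer to the target receives a positive bonus and a step that moves it away receives a penalty, which is the usual motivation for distance-style potentials.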
To improve the performance of UAV autonomous maneuvering decision-making, this paper proposes a decision-making method based on situational continuity. The algorithm designs a situation evaluation function with strong guidance, then trains a Long Short-Term Memory (LSTM) network under the framework of the Deep Q Network (DQN) for air combat maneuvering decision-making. Considering the continuity between adjacent situations, the method takes multiple consecutive situations as one input of the neural network. To reflect the difference between adjacent situations, the method takes the difference of situation evaluation values as the reinforcement learning reward. In different scenarios, the proposed algorithm is compared with an algorithm based on the Fully Neural Network (FNN) and with an algorithm based on statistical principles. The results show that, compared with the FNN algorithm, the proposed algorithm is more accurate and forward-looking. Compared with the algorithm based on statistical principles, its decision-making is more efficient and its real-time performance is better.
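The two ideas in the abstract above (feeding several consecutive situations to the network as one input, and rewarding the change in situation evaluation) might look like this in outline. The window length, the flat feature layout, and the evaluation function are placeholders chosen for illustration; the paper's actual situation features are not given here.

```python
from collections import deque


class SituationWindow:
    """Keeps the most recent `length` situations and exposes them as one input."""

    def __init__(self, length):
        self.frames = deque(maxlen=length)

    def push(self, situation):
        self.frames.append(tuple(situation))

    def ready(self):
        return len(self.frames) == self.frames.maxlen

    def as_input(self):
        # Oldest situation first, flattened into a single feature list.
        return [value for frame in self.frames for value in frame]


def difference_reward(evaluate, previous, current):
    """Reward = evaluation of the current situation minus that of the previous one."""
    return evaluate(current) - evaluate(previous)
```

In a sequence model such as an LSTM the window would normally be kept as a list of frames rather than flattened; the flat form is used here only to keep the sketch short.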
Aiming at intelligent decision-making of unmanned aerial vehicles (UAVs) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of the UAV is established as a Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation results show that, compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence and is more suitable for solving combat problems. The proposed algorithm enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, adjust their actions to approach, and successfully strike the enemy, providing a new method for UAVs to make intelligent maneuvering decisions during air combat.
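For readers comparing the two algorithms named above, the sketch below contrasts how the critic's learning target is formed in DDPG and in TD3 (clipped double-Q plus target policy smoothing). It follows the standard published form of the two algorithms, not this paper's code; the hyperparameter defaults and function names are illustrative assumptions.

```python
import random


def ddpg_target(reward, done, next_q, gamma=0.99):
    """DDPG: bootstrap from a single target critic."""
    return reward + gamma * (1.0 - float(done)) * next_q


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD3: bootstrap from the smaller of two target critics (clipped double-Q)."""
    return reward + gamma * (1.0 - float(done)) * min(next_q1, next_q2)


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """TD3 target policy smoothing: add clipped Gaussian noise, then clip to bounds."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(low, min(high, action + noise))
```

Taking the minimum of two critics counteracts the value overestimation that a single critic tends to accumulate, which is one commonly cited reason TD3 trains more stably than DDPG.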
The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants’ manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent’s strategies. Especially in a highly uncertain environment, even a small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the proposed method, which can better explain the behavioral decision-making process of the agent during interaction. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Aiming at the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine the communication information between UAVs. Finally, simulation results of the algorithm in two sets of scenarios are given, and the results show that the method has good effectiveness and applicability.
Decision-making and motion planning are extremely important in autonomous driving to ensure safe driving in a real-world environment. This study proposes an online evolutionary decision-making and motion planning framework for autonomous driving based on a hybrid data- and model-driven method. First, a data-driven decision-making module based on deep reinforcement learning (DRL) is developed to pursue rational driving performance as much as possible. Then, model predictive control (MPC) is employed to execute both longitudinal and lateral motion planning tasks. Multiple constraints are defined according to the vehicle’s physical limits to meet the driving task requirements. Finally, two principles of safety and rationality for the self-evolution of autonomous driving are proposed. A motion envelope is established and embedded into a rational exploration and exploitation scheme, which filters out unreasonable experiences by masking unsafe actions so as to collect high-quality training data for the DRL agent. Experiments with a high-fidelity vehicle model and a MATLAB/Simulink co-simulation environment are conducted, and the results show that the proposed online-evolution framework is able to generate safer, more rational, and more efficient driving actions in a real-world environment.
While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both the training and testing processes using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. These results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
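As a rough illustration of the safety mask described above, the sketch below computes the Responsibility-Sensitive Safety (RSS) minimum longitudinal gap in its commonly published form and uses it to restrict a discrete action set. The three-action set, the rule "only braking is allowed when the gap is too small", and all parameter names are assumptions made for this example; the paper's actual mask is not specified in the abstract.

```python
def rss_min_gap(v_rear, v_front, response_time, a_max_accel, b_min_brake, b_max_brake):
    """RSS minimum safe longitudinal gap (metres) between a rear and a front vehicle."""
    # Worst case: the rear car accelerates during its response time, then brakes gently,
    # while the front car brakes as hard as possible.
    v_after = v_rear + response_time * a_max_accel
    gap = (v_rear * response_time
           + 0.5 * a_max_accel * response_time ** 2
           + v_after ** 2 / (2.0 * b_min_brake)
           - v_front ** 2 / (2.0 * b_max_brake))
    return max(0.0, gap)


def allowed_actions(gap, min_gap, actions=("accelerate", "keep", "brake")):
    """Hypothetical mask: if the gap is below the RSS bound, only braking remains."""
    return list(actions) if gap >= min_gap else ["brake"]


def masked_best_action(q_values, allowed):
    """Pick the highest-valued action among those the mask allows."""
    return max(allowed, key=lambda action: q_values[action])
```

Applying the mask before the arg-max means the agent can never select a masked action, in training or in testing, regardless of what its value estimates say.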
Due to ever-growing soccer data collection approaches and progressing artificial intelligence (AI) methods, soccer analysis, evaluation, and decision-making have received increasing interest from not only the professional sports analytics realm but also the academic AI research community. AI brings game-changing approaches to soccer analytics, while soccer has been a typical benchmark for AI research. The combination has been an emerging topic. In this paper, soccer match analytics are treated as a complete observation-orientation-decision-action (OODA) loop. In addition, as in AI frameworks such as that for reinforcement learning, interacting with a virtual environment enables an evolving model. Therefore, soccer analytics in both the real-world and virtual domains are discussed. At the intersection of the OODA loop and the real-virtual domains, available soccer data, including event and tracking data, and diverse orientation and decision-making models for both real-world and virtual soccer matches are comprehensively reviewed. Finally, some promising directions in this interdisciplinary area are pointed out. It is claimed that paradigms for professional sports analytics and AI research could be combined. Moreover, it is quite promising to bridge the gap between the real and virtual domains for soccer match analysis and decision-making.
To solve the problem that multiple missiles should simultaneously attack unmeasurable maneuvering targets, a guidance law with a temporal consistency constraint based on the super-twisting observer is proposed. Firstly, the relative motion equations between multiple missiles and targets are established, and the topological model among multiple agents is considered. Secondly, based on the temporal consistency constraint, a cooperative guidance law for simultaneous arrival with finite-time convergence is derived. Finally, the unknown target maneuvering is regarded as bounded interference. Based on second-order sliding mode theory, a super-twisting sliding mode observer is devised to observe and track the bounded interference, and the stability of the observer is proved. Compared with existing research, this approach only needs to obtain the sliding mode variable, which simplifies the design process. The simulation results show that the designed cooperative guidance law for maneuvering targets achieves the expected effect. It ensures successful cooperative attacks, even when confronted with strongly maneuvering targets.
Humans are experiencing the inclusion of artificial agents in their lives, such as unmanned vehicles, service robots, voice assistants, and intelligent medical care. If the artificial agents cannot align with social values or make ethical decisions, they may not meet the expectations of humans. Traditionally, an ethical decision-making framework is constructed by rule-based or statistical approaches. In this paper, we propose an ethical decision-making framework based on incremental ILP (Inductive Logic Programming), which can overcome the brittleness of rule-based approaches and the limited interpretability of statistical approaches. As current incremental ILP has difficulty resolving conflicts, we propose a novel ethical decision-making framework that considers conflicts and adopts our proposed incremental ILP system. The framework consists of two processes: the learning process and the deduction process. The first process records bottom clauses with their score functions and learns rules guided by the entailment and the score function. The second process obtains an ethical decision based on the rules. In an ethical scenario about chatbots for teenagers’ mental health, we verify that our framework can learn ethical rules and make ethical decisions. In addition, we extract the incremental ILP from the framework and compare it with state-of-the-art ILP systems based on ASP (Answer Set Programming), focusing on conflict resolution. The results of the comparisons show that our proposed system can generate better-quality rules than most other systems.
Stroke is a chronic cerebrovascular disease that carries a high risk. Stroke risk assessment is of great significance in preventing, reversing, and reducing the spread and the health hazards caused by stroke. Aiming to objectively predict and identify strokes, this paper proposes a new stroke risk assessment decision-making model named Logistic-AdaBoost (Logistic-AB) based on machine learning. First, the categorical boosting (CatBoost) method is used to perform feature selection over all stroke features, and 8 main features are selected to form a new index evaluation system to predict the risk of stroke. Second, the borderline synthetic minority oversampling technique (SMOTE) algorithm is applied to transform the unbalanced stroke dataset into a balanced dataset. Finally, the stroke risk assessment decision-making model Logistic-AB is constructed, and the overall prediction performance of this new model is evaluated by comparing it with ten other similar models. The comparison results show that the new model performs better than the two single algorithms (logistic regression and AdaBoost) on the four indicators of recall, precision, F1 score, and accuracy, and that its overall performance is better than that of common machine learning algorithms. The Logistic-AB model presented in this paper can more accurately predict patients’ stroke risk.
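The four indicators named above (recall, precision, F1 score, accuracy) can be computed from binary labels as in the sketch below. The label encoding (1 for stroke, 0 otherwise) and the function name are assumptions for illustration; this is not the paper's evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Return recall, precision, F1 and accuracy for labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty denominators, e.g. when no positives are predicted.
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"recall": recall, "precision": precision, "f1": f1, "accuracy": accuracy}
```

On an unbalanced dataset accuracy alone can look high even when most positive cases are missed, which is why recall, precision, and F1 are reported alongside it.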
Spherical q-linear Diophantine fuzzy sets (Sq-LDFSs) have proved more effective for handling uncertainty and vagueness in multi-criteria decision-making (MADM). They not only cover data in two variable parameters but are also beneficial for three-parameter data. With Pythagorean fuzzy sets, the difference is calculated only between two parameters (membership and non-membership). According to human thinking, fuzzy data can be found in three parameters (membership, uncertainty, and non-membership). So, to make a compromise decision, comparing Sq-LDFSs is essential. Existing measures for different fuzzy sets can, however, have several flaws that lead to counterintuitive results. For instance, they treat any increase or decrease in the membership degree the same as one in the non-membership degree because the uncertainty does not change, even though each parameter has a different implication. For the comparison of Sq-LDFSs, this research develops the differential measure (DFM). The main goal of the DFM is to address the unfair arguments that arise from treating the opposing criteria of different types of FSs equally. Due to their relative positions in the attribute space and the similarity of their membership and non-membership degrees, two Sq-LDFSs form this preference connection when the uncertainty remains the same in both sets. According to the degree of superiority or inferiority, two Sq-LDFSs are shown to be identical, equivalent, superior, or inferior to one another. The fundamental characteristics of the suggested DFM are provided. Based on the newly developed DFM, a unique approach to multiple-criteria group decision-making is offered. Our suggested method verifies the novel way of calculating the expert weights for Sq-LDFSs as in PFSs. Our proposed three-parameter technique is applied to evaluate solid-state drives and to choose the optimum photovoltaic cell in two applications by taking the uncertainty parameter as zero. The method’s applicability and validity, as shown by the findings, are contrasted with those obtained using various other existing approaches. To assess its stability and usefulness, a sensitivity analysis is performed.
Tourism is a popular activity that allows individuals to escape their daily routines and explore new destinations for various reasons, including leisure, pleasure, or business. A recent study has proposed a unique mathematical concept called a q-rung orthopair fuzzy hypersoft set (q-ROFHS) to enhance the formal representation of human thought processes and evaluate tourism carrying capacity. This approach can capture the imprecision and ambiguity often present in human perception. With the advanced mathematical tools in this field, the study has also incorporated the Einstein aggregation operator and score function into the q-ROFHS values to support multi-attribute decision-making algorithms. By implementing this technique, effective plans can be developed for social and economic development while avoiding detrimental effects such as overcrowding or environmental damage caused by tourism. A case study of selected tourism carrying capacity demonstrates the proposed methodology.
An algorithm to track multiple sharply maneuvering targets without prior knowledge about new target birth is proposed. These targets, such as drones and agile missiles, are capable of achieving sharp maneuvers within a short period of time. The probability hypothesis density (PHD) filter, which propagates only the first-order statistical moment of the full target posterior, has been shown to be a computationally efficient solution to multitarget tracking problems. However, the standard PHD filter operates on a single dynamic model and requires prior information about the target birth distribution, which leads to many limitations in practical applications. In this paper, we introduce a nonzero-mean, white-noise turn rate dynamic model and generalize jump Markov systems to the multitarget case to accommodate sharply maneuvering dynamics. Moreover, to adaptively estimate newborn targets’ information, a measurement-driven method based on the recursive random sampling consensus (RANSAC) algorithm is proposed. Simulation results demonstrate that the proposed method achieves significant improvement in tracking multiple sharply maneuvering targets with adaptive birth estimation.
Breastfeeding practices are influenced by multifactorial determinants including individual characteristics, external support systems, and media influences. This commentary emphasizes such complex factors influencing breastfeeding practices. Potential methodological limitations and the need for diverse sampling in studying breastfeeding practices are highlighted. Further research must explore the interplay between social influences, cultural norms, government policies, and individual factors in shaping maternal breastfeeding decisions.
Patients and physicians understand the importance of self-care following spinal cord injury (SCI), yet many individuals with SCI do not adhere to recommended self-care activities despite logistical supports. Neurobehavioral determinants of SCI self-care behavior, such as impulsivity, are not widely studied, yet understanding them could inform efforts to improve SCI self-care. We explored associations between impulsivity and self-care in an observational study of 35 US adults aged 18-50 who had traumatic SCI with paraplegia at least six months before assessment. The primary outcome measure was self-reported self-care. In LASSO regression models that included all neurobehavioral measures and demographics as predictors of self-care, dispositional measures of greater impulsivity (negative urgency, lack of premeditation, lack of perseverance) and reduced mindfulness were associated with reduced self-care. Outcome (magnitude) sensitivity, a latent decision-making parameter derived from computationally modeling successive choices in a gambling task, was also associated with self-care behavior. These results are preliminary; more research is needed to demonstrate the utility of these findings in clinical settings. Information about associations between impulsivity and poor self-care in people with SCI could guide the development of interventions to improve SCI self-care and help patients with elevated risks related to self-care and secondary health conditions.
To effectively deal with fuzzy and uncertain information in public engineering emergencies, an emergency decision-making method based on multi-granularity language information is proposed. Firstly, decision makers select the appropriate language phrase set according to their own situation, give the preference information on the weight of each key indicator, and then transform the multi-granularity language information through consistency. On this basis, the sequential optimization technology of the approximately ideal scheme is introduced to obtain the weight coefficient of each key indicator. Subsequently, the weighted average operator is used to aggregate the preference information of each alternative scheme with the relative importance of decision makers and the weight of key indicators in sequence, and the comprehensive evaluation value of each scheme is obtained to determine the optimal scheme. Lastly, the effectiveness and practicability of the method are verified by taking an earthwork collapse accident in the construction of a reservoir as an example.
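The sequential aggregation step described above (first across decision makers by their relative importance, then across key indicators by their weights) can be sketched as below. The numeric scores stand in for linguistic terms that have already been converted to a common scale; that conversion, the data layout, and all names are assumptions for this example.

```python
def weighted_average(values, weights):
    """Weighted average operator; weights need not sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(v * w for v, w in zip(values, weights)) / total


def rank_schemes(scores, expert_weights, indicator_weights):
    """scores[scheme][expert][indicator] -> list of (scheme, value), best first."""
    overall = {}
    for scheme, by_expert in scores.items():
        # Step 1: merge the experts' opinions on each indicator.
        per_indicator = [
            weighted_average([row[j] for row in by_expert], expert_weights)
            for j in range(len(indicator_weights))
        ]
        # Step 2: merge the indicators into one comprehensive evaluation value.
        overall[scheme] = weighted_average(per_indicator, indicator_weights)
    return sorted(overall.items(), key=lambda item: item[1], reverse=True)
```

The first entry of the returned list is the scheme with the highest comprehensive evaluation value, which corresponds to the optimal scheme in the abstract's terms.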
The Bayesian inference model is an optimal processing of incomplete information that, more than other models, captures the way in which any decision-maker learns and updates his degree of rational belief about possible states of nature, in order to make a better judgment while taking new evidence into account. Such a scientific model proposed for the general theory of decision-making, like all others, whether in statistics, economics, operations research, A.I., data science, or applied mathematics, and regardless of whether they are time-dependent, shares a theoretical basis that is axiomatized by relying on related concepts of a universe of possibles, especially the so-called universe (or the world) and the state of nature (or the state of the world), when formulated explicitly. The issue of where to stand as an observer or a decision-maker to reframe such a universe of possibles together with a partition structure of knowledge (i.e., semantic formalisms), including a copy of itself as it was initially while generalizing it, is not addressed. Memory being the substratum, whether human or artificial, wherein everything stands, to date even the theoretical possibility of such an operation of self-inclusion is prohibited by pure mathematics. We bring this blind spot to light through a counter-example (namely Archimedes’ Eureka experiment) and explore novel theoretical foundations, fitting better with a quantum form than with fuzzy modeling, to deal with more than a single reference universe of possibles. This could open up a new path of investigation for the general theory of decision-making, as well as for Artificial Intelligence, often considered as the science of the imitation of human abilities, while also being the science of knowledge representation and the science of concept formation and reasoning.
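The belief update that the abstract above takes as its starting point is ordinary Bayesian conditioning over a fixed set of states of nature. A minimal sketch follows; the state names and probabilities in the usage note are invented purely for illustration.

```python
def bayes_update(prior, likelihood):
    """Posterior over states of nature given P(evidence | state) for each state."""
    unnormalized = {state: prior[state] * likelihood[state] for state in prior}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under the prior")
    return {state: value / evidence for state, value in unnormalized.items()}
```

For example, with a prior of 0.5 on each of two hypothetical states "rain" and "dry" and likelihoods of 0.8 and 0.2 for the observed evidence, the posterior puts 0.8 on "rain". Note that the set of states is fixed before any update takes place, which is exactly the kind of single reference universe the abstract goes on to question.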
In the developmental dilemma of artificial intelligence (AI)-assisted judicial decision-making, the technical architecture of AI determines its inherent lack of transparency and interpretability, which is difficult to improve fundamentally. This can be considered a true challenge in the realm of AI-assisted judicial decision-making. By examining the court’s acceptance, integration, and trade-offs of AI technology embedded in the judicial field, the exploration of potential conflicts, interactions, and even mutual shaping between the two will not only reshape their conceptual connotations and intellectual boundaries but also strengthen the cognition and re-interpretation of the basic principles and core values of the judicial trial system.
Spacecraft orbit evasion is an effective method to ensure space safety. In the spacecraft’s orbital plane, a non-cooperative space target that autonomously approaches the spacecraft may cause a dangerous rendezvous. To deal with this problem, an optimal maneuvering strategy based on the relative navigation observability degree is proposed with angles-only measurements. A maneuver evasion relative navigation model in the spacecraft’s orbital plane is constructed, and observability measurement criteria with process noise and measurement noise are defined based on the posterior Cramer-Rao lower bound. Further, the optimal maneuver evasion strategy in the spacecraft’s orbital plane based on observability is proposed. The strategy provides a new idea for spacecraft to evade safety threats autonomously. Compared with spacecraft evasion based on absolute navigation, more accurate evasion results can be obtained. The simulation indicates that this optimal strategy can weaken the system’s observability and reduce the state estimation accuracy of the non-cooperative target, making it impossible for the non-cooperative target to accurately approach the spacecraft.
Funding: Supported by the Key Research and Development Program of Shaanxi (2022GXLH-02-09), the Aeronautical Science Foundation of China (20200051053001), and the Natural Science Basic Research Program of Shaanxi (2020JM-147).
Funding: Supported by the Natural Science Basic Research Program of Shaanxi (Program No. 2022JQ-593).
Funding: The authors acknowledge the National Natural Science Foundation of China (Grant Nos. 61573285, 62003267), the Open Fund of the Key Laboratory of Data Link Technology of China Electronics Technology Group Corporation (Grant No. CLDL-20182101), and the Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ220) for funding the experiments.
Abstract: Aiming at intelligent decision-making of unmanned aerial vehicles (UAVs) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of the UAV is established as a Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation results show that, compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence and is more suitable for solving combat problems. The proposed algorithm enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, adjust their actions to approach the enemy, and successfully strike it, providing a new method for intelligent UAV maneuvering decisions during air combat.
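One well-known difference between the two algorithms compared above is how the critic's learning target is formed: TD3 takes the minimum of two target critics to curb value overestimation, whereas DDPG uses a single target critic. The sketch below shows only that difference for a single transition; the function names and the default discount are illustrative assumptions, and TD3's other components (target policy smoothing, delayed actor updates) are omitted.

```python
def ddpg_target(reward, done, q_next, gamma=0.99):
    """DDPG target: bootstrap from a single target-critic estimate."""
    return reward + gamma * (1.0 - float(done)) * q_next


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 target: bootstrap from the smaller of two target-critic estimates."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)
```

When the two critics disagree, the TD3 target is never larger than the DDPG target computed from either critic alone, which is the mechanism usually credited for its more stable training.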
Abstract: The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants’ manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent’s strategies. In a highly uncertain environment in particular, even small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the proposed method, which can better explain the behavioral decision-making process of the agent during interaction. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Funding: Supported in part by the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation of China (Grant No. 20220001068001), the National Natural Science Foundation of China (Grant No. 61673327), the Natural Science Basic Research Plan in Shaanxi Province, China (Grant No. 2023-JC-QN-0733), and the China Industry-University-Research Innovation Foundation (Grant No. 2022IT188).
Abstract: Aiming at the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine communication information between UAVs. Finally, simulation results of the algorithm in two sets of scenarios are given, and the results show that the method is effective and applicable.
Funding: Financial support from the National Key Research and Development Program of China (2020AAA0108100), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100), and the Shanghai Gaofeng and Gaoyuan Project for University Academic Program Development.
Abstract: Decision-making and motion planning are extremely important in autonomous driving to ensure safe driving in a real-world environment. This study proposes an online evolutionary decision-making and motion planning framework for autonomous driving based on a hybrid data- and model-driven method. First, a data-driven decision-making module based on deep reinforcement learning (DRL) is developed to pursue driving performance that is as rational as possible. Then, model predictive control (MPC) is employed to execute both longitudinal and lateral motion planning tasks. Multiple constraints are defined according to the vehicle’s physical limits to meet the driving task requirements. Finally, two principles of safety and rationality for the self-evolution of autonomous driving are proposed. A motion envelope is established and embedded into a rational exploration and exploitation scheme, which filters out unreasonable experiences by masking unsafe actions so as to collect high-quality training data for the DRL agent. Experiments with a high-fidelity vehicle model and a MATLAB/Simulink co-simulation environment are conducted, and the results show that the proposed online-evolution framework is able to generate safer, more rational, and more efficient driving actions in a real-world environment.
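Masking unsafe actions, as mentioned in this abstract, is commonly implemented by excluding the masked actions before the agent picks one. The sketch below shows one simple greedy variant; the function name and the boolean `safe_mask` are hypothetical, and how the paper derives the mask from its motion envelope is not shown.

```python
import math


def masked_greedy_action(q_values, safe_mask):
    """Return the index of the highest-valued action among those marked safe."""
    if len(q_values) != len(safe_mask):
        raise ValueError("q_values and safe_mask must have the same length")
    if not any(safe_mask):
        raise ValueError("no safe action available")
    # Unsafe actions get a value of -inf so they can never be selected.
    masked = [q if safe else -math.inf for q, safe in zip(q_values, safe_mask)]
    return max(range(len(masked)), key=masked.__getitem__)
```

Because unsafe actions are never executed, the transitions stored for training come only from actions inside the safe set, which matches the abstract's stated goal of collecting higher-quality experience.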
Funding: Supported in part by the Start-Up Grant-Nanyang Assistant Professorship Grant of Nanyang Technological University, the Agency for Science, Technology and Research (A*STAR) under the Advanced Manufacturing and Engineering (AME) Young Individual Research Grant (A2084c0156), the MTC Individual Research Grant (M22K2c0079), the ANR-NRF Joint Grant (NRF2021-NRF-ANR003 HM Science), and the Ministry of Education (MOE) under the Tier 2 Grant (MOE-T2EP50222-0002).
Abstract: While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both training and testing using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. The results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
Funding: Supported by the National Key Research and Development Program of China (2020AAA0103404), the Beijing Nova Program (20220484077), and the National Natural Science Foundation of China (62073323).
Abstract: Due to ever-growing soccer data collection approaches and progressing artificial intelligence (AI) methods, soccer analysis, evaluation, and decision-making have received increasing interest from not only the professional sports analytics realm but also the academic AI research community. AI brings game-changing approaches to soccer analytics, while soccer has long been a typical benchmark for AI research. The combination has been an emerging topic. In this paper, soccer match analytics are treated as a complete observation-orientation-decision-action (OODA) loop. In addition, as in AI frameworks such as reinforcement learning, interacting with a virtual environment enables an evolving model. Therefore, soccer analytics in both the real-world and virtual domains are discussed. At the intersection of the OODA loop and the real-virtual domains, available soccer data, including event and tracking data, and diverse orientation and decision-making models for both real-world and virtual soccer matches are comprehensively reviewed. Finally, some promising directions in this interdisciplinary area are pointed out. It is claimed that paradigms for professional sports analytics and AI research could be combined. Moreover, bridging the gap between the real and virtual domains is quite promising for soccer match analysis and decision-making.
Funding: Supported by the Funds for the Central Universities.
Abstract: To solve the problem of multiple missiles simultaneously attacking unmeasurable maneuvering targets, a guidance law with a temporal consistency constraint based on the super-twisting observer is proposed. Firstly, the relative motion equations between multiple missiles and targets are established, and the topological model among multiple agents is considered. Secondly, based on the temporal consistency constraint, a cooperative guidance law for simultaneous arrival with finite-time convergence is derived. Finally, the unknown target maneuvering is regarded as bounded interference. Based on second-order sliding mode theory, a super-twisting sliding mode observer is devised to observe and track the bounded interference, and the stability of the observer is proved. Compared with existing research, this approach only needs to obtain the sliding mode variable, which simplifies the design process. The simulation results show that the designed cooperative guidance law for maneuvering targets achieves the expected effect. It ensures successful cooperative attacks, even when confronted with strongly maneuvering targets.
Funding: This work was funded by the National Natural Science Foundation of China (Nos. U22A2099, 61966009, 62006057) and the Graduate Innovation Program (No. YCSW2022286).
Abstract: Humans are experiencing the inclusion of artificial agents in their lives, such as unmanned vehicles, service robots, voice assistants, and intelligent medical care. If artificial agents cannot align with social values or make ethical decisions, they may not meet human expectations. Traditionally, an ethical decision-making framework is constructed by rule-based or statistical approaches. In this paper, we propose an ethical decision-making framework based on incremental ILP (Inductive Logic Programming), which can overcome the brittleness of rule-based approaches and the limited interpretability of statistical approaches. Because current incremental ILP has difficulty resolving conflicts, the proposed framework explicitly considers conflicts and adopts our own incremental ILP system. The framework consists of two processes: the learning process and the deduction process. The first process records bottom clauses with their score functions and learns rules guided by entailment and the score function. The second process obtains an ethical decision based on the rules. In an ethical scenario about chatbots for teenagers’ mental health, we verify that our framework can learn ethical rules and make ethical decisions. In addition, we extract the incremental ILP system from the framework and compare it with state-of-the-art ILP systems based on ASP (Answer Set Programming), focusing on conflict resolution. The comparison results show that our proposed system can generate better-quality rules than most other systems.
Funding: Supported by the National Natural Science Foundation of China (No. 72071150).
Abstract: Stroke is a chronic cerebrovascular disease that carries a high risk. Stroke risk assessment is of great significance in preventing, reversing, and reducing the spread of and the health hazards caused by stroke. Aiming to objectively predict and identify strokes, this paper proposes a new machine-learning-based stroke risk assessment decision-making model named Logistic-AdaBoost (Logistic-AB). First, the categorical boosting (CatBoost) method is used to perform feature selection over all stroke features, and 8 main features are selected to form a new index evaluation system for predicting the risk of stroke. Second, the borderline synthetic minority oversampling technique (SMOTE) algorithm is applied to transform the unbalanced stroke dataset into a balanced dataset. Finally, the stroke risk assessment decision-making model Logistic-AB is constructed, and its overall prediction performance is evaluated by comparing it with ten other similar models. The comparison results show that the proposed model performs better than the two single algorithms (logistic regression and AdaBoost) on the four indicators of recall, precision, F1 score, and accuracy, and that its overall performance is better than that of common machine learning algorithms. The Logistic-AB model presented in this paper can predict patients’ stroke risk more accurately.
Funding: The Deanship of Scientific Research at Umm Al-Qura University (Grant Code: 22UQU4310396DSR65).
Abstract: Spherical q-linear Diophantine fuzzy sets (Sq-LDFSs) have proved more effective for handling uncertainty and vagueness in multi-criteria decision-making (MADM). They not only cover data in two variable parameters but are also beneficial for three-parameter data. With Pythagorean fuzzy sets (PFSs), the difference is calculated only between two parameters (membership and non-membership). According to human thinking, fuzzy data can be found in three parameters (membership, uncertainty, and non-membership). So, to make a compromise decision, comparing Sq-LDFSs is essential. Existing measures for different fuzzy sets can, however, have several flaws that lead to counterintuitive results. For instance, they treat any increase or decrease in the membership degree the same as one in the non-membership degree because the uncertainty does not change, even though each parameter has a different implication. For Sq-LDFS comparison, this research develops the differential measure (DFM). The main goal of the DFM is to address the unfair arguments that come from treating opposing criteria of different types of FSs equally. Due to their relative positions in the attribute space and the similarity of their membership and non-membership degrees, two Sq-LDFSs form this preference connection when the uncertainty remains the same in both sets. According to the degree of superiority or inferiority, two Sq-LDFSs are shown as identical, equivalent, superior, or inferior to one another. The fundamental characteristics of the suggested DFM are provided. Based on the newly developed DFM, a unique approach to multiple-criteria group decision-making is offered. The suggested method verifies the novel way of calculating the expert weights for Sq-LDFSs as in PFSs. The proposed three-parameter technique is applied in two applications, evaluating solid-state drives and choosing the optimum photovoltaic cell, by setting the uncertainty parameter to zero. The method’s applicability and validity, as shown by the findings, are contrasted with those obtained using various other existing approaches. To assess its stability and usefulness, a sensitivity analysis is performed.
Funding: The National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A4A1031509).
Abstract: Tourism is a popular activity that allows individuals to escape their daily routines and explore new destinations for various reasons, including leisure, pleasure, or business. A recent study has proposed a unique mathematical concept called a q-rung orthopair fuzzy hypersoft set (q-ROFHS) to enhance the formal representation of human thought processes and evaluate tourism carrying capacity. This approach can capture the imprecision and ambiguity often present in human perception. Using the advanced mathematical tools in this field, the study has also incorporated the Einstein aggregation operator and score function into the q-ROFHS values to support multi-attribute decision-making algorithms. By implementing this technique, effective plans can be developed for social and economic development while avoiding detrimental effects such as overcrowding or environmental damage caused by tourism. A case study of selected tourism carrying capacity demonstrates the proposed methodology.
Funding: Supported by the National Natural Science Foundation of China (61773142).
Abstract: An algorithm to track multiple sharply maneuvering targets without prior knowledge about new target birth is proposed. These targets, such as drones and agile missiles, are capable of achieving sharp maneuvers within a short period of time. The probability hypothesis density (PHD) filter, which propagates only the first-order statistical moment of the full target posterior, has been shown to be a computationally efficient solution to multitarget tracking problems. However, the standard PHD filter operates on a single dynamic model and requires prior information about the target birth distribution, which leads to many limitations in practical applications. In this paper, we introduce a nonzero-mean, white-noise turn rate dynamic model and generalize jump Markov systems to the multitarget case to accommodate sharply maneuvering dynamics. Moreover, to adaptively estimate newborn targets’ information, a measurement-driven method based on the recursive random sampling consensus (RANSAC) algorithm is proposed. Simulation results demonstrate that the proposed method achieves significant improvement in tracking multiple sharply maneuvering targets with adaptive birth estimation.
Abstract: Breastfeeding practices are influenced by multifactorial determinants including individual characteristics, external support systems, and media influences. This commentary emphasizes these complex factors influencing breastfeeding practices. Potential methodological limitations and the need for diverse sampling in studying breastfeeding practices are highlighted. Further research must explore the interplay between social influences, cultural norms, government policies, and individual factors in shaping maternal breastfeeding decisions.
Abstract: Patients and physicians understand the importance of self-care following spinal cord injury (SCI), yet many individuals with SCI do not adhere to recommended self-care activities despite logistical supports. Neurobehavioral determinants of SCI self-care behavior, such as impulsivity, are not widely studied, yet understanding them could inform efforts to improve SCI self-care. We explored associations between impulsivity and self-care in an observational study of 35 US adults aged 18-50 who had traumatic SCI with paraplegia at least six months before assessment. The primary outcome measure was self-reported self-care. In LASSO regression models that included all neurobehavioral measures and demographics as predictors of self-care, dispositional measures of greater impulsivity (negative urgency, lack of premeditation, lack of perseverance) and reduced mindfulness were associated with reduced self-care. Outcome (magnitude) sensitivity, a latent decision-making parameter derived from computationally modeling successive choices in a gambling task, was also associated with self-care behavior. These results are preliminary; more research is needed to demonstrate the utility of these findings in clinical settings. Information about associations between impulsivity and poor self-care in people with SCI could guide the development of interventions to improve SCI self-care and help patients with elevated risks related to self-care and secondary health conditions.
Abstract: To effectively deal with fuzzy and uncertain information in public engineering emergencies, an emergency decision-making method based on multi-granularity language information is proposed. Firstly, decision makers select the appropriate language phrase set according to their own situation and give preference information on the weight of each key indicator; the multi-granularity language information is then transformed to a consistent form. On this basis, the sequential optimization technique of the approximately ideal scheme is introduced to obtain the weight coefficient of each key indicator. Subsequently, the weighted average operator is used to aggregate the preference information of each alternative scheme with the relative importance of decision makers and the weights of key indicators in sequence, and the comprehensive evaluation value of each scheme is obtained to determine the optimal scheme. Lastly, the effectiveness and practicability of the method are verified by taking an earthwork collapse accident during the construction of a reservoir as an example.
Abstract: The Bayesian inference model is an optimal processing of incomplete information that captures, better than other models, the way in which any decision-maker learns and updates his degree of rational belief about possible states of nature in order to make a better judgment while taking new evidence into account. Such a scientific model proposed for the general theory of decision-making, like all others in general, whether in statistics, economics, operations research, A.I., data science or applied mathematics, and regardless of whether it is time-dependent, shares a theoretical basis that is axiomatized by relying on related concepts of a universe of possibles, especially the so-called universe (or the world) and the state of nature (or the state of the world), when formulated explicitly. The issue of where to stand as an observer or a decision-maker to reframe such a universe of possibles together with a partition structure of knowledge (i.e. semantic formalisms), including a copy of itself as it was initially while generalizing it, is not addressed. Memory being the substratum, whether human or artificial, wherein everything stands, to date even the theoretical possibility of such an operation of self-inclusion is prohibited by pure mathematics. We bring this blind spot to light through a counter-example (namely Archimedes’ Eureka experiment) and explore novel theoretical foundations, fitting better with a quantum form than with fuzzy modeling, to deal with more than a reference universe of possibles. This could open up a new path of investigation for the general theory of decision-making, as well as for Artificial Intelligence, often considered as the science of the imitation of human abilities, while being also the science of knowledge representation and the science of concept formation and reasoning.
Abstract: In the developmental dilemma of artificial intelligence (AI)-assisted judicial decision-making, the technical architecture of AI determines its inherent lack of transparency and interpretability, which is difficult to improve fundamentally. This can be considered a true challenge in the realm of AI-assisted judicial decision-making. By examining the court’s acceptance, integration, and trade-offs of AI technology embedded in the judicial field, the exploration of potential conflicts, interactions, and even mutual shaping between the two will not only reshape their conceptual connotations and intellectual boundaries but also strengthen the cognition and re-interpretation of the basic principles and core values of the judicial trial system.
Funding: Supported by the National Key R&D Program of China (2020YFA0713502), the Special Fund Project for Guiding Local Scientific and Technological Development (2020ZYT003), the National Natural Science Foundation of China (U20B2055, 61773021, 61903086), and the Natural Science Foundation of Hunan Province (2019JJ20018, 2020JJ4280).
Abstract: Spacecraft orbit evasion is an effective method to ensure space safety. A non-cooperative space target that autonomously approaches the spacecraft within its orbital plane may cause a dangerous rendezvous. To deal with this problem, an optimal maneuvering strategy based on the relative navigation observability degree is proposed with angles-only measurements. A maneuver evasion relative navigation model in the spacecraft’s orbital plane is constructed, and the observability measurement criteria with process noise and measurement noise are defined based on the posterior Cramer-Rao lower bound. Further, the optimal maneuver evasion strategy in the spacecraft’s orbital plane based on observability is proposed. The strategy provides a new idea for spacecraft to evade safety threats autonomously. Compared with approaches to the spacecraft evasion problem based on absolute navigation, more accurate evasion results can be obtained. The simulation indicates that this optimal strategy can weaken the system’s observability and reduce the state estimation accuracy of the non-cooperative target, making it impossible for the non-cooperative target to accurately approach the spacecraft.