Ultra-high performance cement-based composites (UHPCC) are promising for concrete structures subjected to impact and explosive loads. In this study, a reference UHPCC mixture with no fiber reinforcement and four mixtures with a single type of fiber reinforcement or hybrid reinforcement of straight smooth and end-hooked steel fibers were prepared. Split Hopkinson pressure bar (SHPB) tests were performed to investigate the dynamic compression behavior of UHPCC, and X-CT testing and 3D reconstruction were used to reveal the failure process of UHPCC under impact loading. Results show that UHPCC with 1% straight smooth fiber and 2% end-hooked fiber reinforcement demonstrated the best static and dynamic mechanical properties. When hybrid steel fiber reinforcement is added to the concrete, more impact energy may be needed to break the matrix and pull out the fibers; thus, the mixture with hybrid steel fiber reinforcement demonstrates excellent dynamic compressive performance.
Copper matrix composites reinforced by in situ-formed hybrid titanium boride whiskers (TiB_(w)) and titanium diboride particles (TiB_(2p)) were fabricated by powder metallurgy. Microstructural observations showed competitive precipitation behavior between TiB_(w) and TiB_(2p), where the relative contents of the two reinforcements varied with sintering temperature. Based on thermodynamic and kinetic assessments, the precipitation mechanisms of the hybrid reinforcements were discussed, and the formation of both TiB_(w) and TiB_(2p) from the local melting zone was found to be thermodynamically favored. The precipitation kinetics were mainly controlled by solid-state diffusion of B atoms. Because a compact compound layer forms, the in situ reactions were divided into two stages, in which Zener growth and Dybkov growth prevailed, respectively. Accordingly, the competitive precipitation behavior was attributed to the transition of the growth model during the reaction process.
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for vehicles to make avatar migration decisions independently based on current and future vehicle status. To address these challenges, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and to ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
Stroke is a leading cause of disability and mortality worldwide, necessitating the development of advanced technologies to improve its diagnosis, treatment, and patient outcomes. In recent years, machine learning techniques have emerged as promising tools in stroke medicine, enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches. This abstract provides a comprehensive overview of machine learning's applications, challenges, and future directions in stroke medicine. Recently introduced machine learning algorithms have been extensively employed in all fields of stroke medicine. Machine learning models have demonstrated remarkable accuracy in imaging analysis, diagnosing stroke subtypes, risk stratification, guiding medical treatment, and predicting patient prognosis. Despite the tremendous potential of machine learning in stroke medicine, several challenges must be addressed. These include the need for standardized and interoperable data collection, robust model validation and generalization, and the ethical considerations surrounding privacy and bias. In addition, integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care. Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis, tailored treatment selection, and improved prognostication. Continued research and collaboration among clinicians, researchers, and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care, ultimately leading to enhanced patient outcomes and quality of life. This review aims to summarize the current implications of machine learning in stroke diagnosis, treatment, and prognostic evaluation. It also explores the future perspectives these techniques can offer in combating this disabling disease.
Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey on ADP and RL for advanced control applications demonstrates their remarkable potential within the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
The forming of textile reinforcements is an important stage in the manufacturing of textile composite parts with the Liquid Composite Molding process. The fiber orientations and part geometry obtained from this stage have a significant impact on the subsequent resin injection and on the final mechanical properties of the composite part. Numerical simulation of textile reinforcement forming is in strong demand, as it can greatly reduce the time and cost of determining the optimized processing parameters, which is the foundation of the low-cost application of composite materials. This review presents the state of the art of forming modeling methods for textile reinforcement and the corresponding experimental characterization methods developed in this field. Microscopic, mesoscopic, and macroscopic models are discussed. Studies concerning the simulation of wrinkling are also presented, since it is the most common defect occurring in textile reinforcement forming. Finally, challenges and recommendations on future research directions for textile reinforcement modeling and experimental characterization are provided.
To address the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine the information communicated between UAVs. Finally, simulation results of the algorithm in two sets of scenarios are given, and the results show that the method is effective and applicable.
While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both the training and testing processes using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. These results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
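The safety-mask idea in the abstract above can be illustrated with a minimal sketch. It assumes a discrete action set and a precomputed list of per-action safety flags standing in for the Responsibility-Sensitive Safety check; the action names, the Q-values, and the function `masked_greedy_action` are hypothetical and do not come from the paper.

```python
import math

def masked_greedy_action(q_values, safe_flags):
    """Return the index of the highest-valued action among those flagged safe."""
    if not any(safe_flags):
        raise ValueError("no safe action available")
    # Unsafe actions get -inf so they can never win the argmax.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, safe_flags)]
    return max(range(len(masked)), key=masked.__getitem__)

# Hypothetical example: the best-valued action is flagged unsafe and is skipped.
ACTIONS = ["keep_lane", "change_left", "accelerate"]
q_values = [0.2, 0.9, 0.5]
safe_flags = [True, False, True]  # assumed output of a safety check
chosen = ACTIONS[masked_greedy_action(q_values, safe_flags)]
print(chosen)  # accelerate
```

Masking at action-selection time, rather than only penalizing collisions in the reward, is what lets a constraint of this kind hold during training as well as testing.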
The stability of ancient flood control levees is mainly influenced by water level fluctuations, groundwater concentration, and rainfall. This paper takes the Lanxi ancient levee as a research object to study the evolution of its seepage, displacement, and stability before and after reinforcement with upside-down hanging wells and a grouting curtain, using numerical simulation combined with experiments and observations. The results indicate that the filled soil is less affected by water level fluctuations and groundwater concentration after reinforcement. A high groundwater level is detrimental to the levee's long-term stability, and drainage issues need to be fully considered. The deformation of the reinforced levee is effectively controlled, since the fill deformation is mainly borne by the upside-down hanging wells. The safety factors of the levee before reinforcement vary significantly with the water level. The minimum safety factor is 0.886 during the period of falling water level, indicating a very high risk of instability. After reinforcement it reaches 1.478, so the stability of the ancient levee is improved by a large margin.
Most research on target encircling control focuses on motion along a circular orbit in an ideal environment free from external disturbances. However, elliptical encirclement with a time-varying observation radius may permit a more flexible and efficient enclosing solution, while the non-orthogonality between the axial and tangential speed components, non-negligible environmental perturbations, and strict assignment requirements make elliptical encircling control more challenging, and the relevant problems remain open. Following this line, an appointed-time elliptical encircling control rule capable of reinforcing circumnavigation performance is developed to enable Unmanned Aerial Vehicles (UAVs) to move along a specified elliptical path within a predetermined reaching time. The main merit of the designed strategy is that the relative distance control error is guaranteed to evolve within specified regions with a designer-specified convergence behavior. Meanwhile, wind perturbations can be counteracted online by an unknown system dynamics estimator (USDE) with only one tuning parameter and high computational efficiency. Lyapunov analysis demonstrates that all error variables involved are ultimately bounded, and simulations are carried out to confirm the usability of the proposed control algorithm.
Decision-making and motion planning are extremely important in autonomous driving to ensure safe driving in a real-world environment. This study proposes an online evolutionary decision-making and motion planning framework for autonomous driving based on a hybrid data- and model-driven method. First, a data-driven decision-making module based on deep reinforcement learning (DRL) is developed to pursue driving performance that is as rational as possible. Then, model predictive control (MPC) is employed to execute both longitudinal and lateral motion planning tasks. Multiple constraints are defined according to the vehicle's physical limits to meet the driving task requirements. Finally, two principles of safety and rationality for the self-evolution of autonomous driving are proposed. A motion envelope is established and embedded into a rational exploration and exploitation scheme, which filters out unreasonable experiences by masking unsafe actions so as to collect high-quality training data for the DRL agent. Experiments with a high-fidelity vehicle model and a MATLAB/Simulink co-simulation environment are conducted, and the results show that the proposed online-evolution framework is able to generate safer, more rational, and more efficient driving actions in a real-world environment.
To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront the challenge of managing the surging demand for data traffic. Within this realm, the network imposes stringent Quality of Service (QoS) requirements, revealing the inadequacies of traditional routing allocation mechanisms in accommodating such extensive data flows. To handle a substantial influx of data requests promptly and to alleviate the constraints of existing technologies and network congestion, we present an architecture for QoS routing optimization within a Software Defined Network (SDN), leveraging deep reinforcement learning. This approach separates the SDN control and transmission functionalities, centralizing control over data forwarding while integrating deep reinforcement learning for informed routing decisions. By factoring in delay, bandwidth, jitter rate, and packet loss rate, we design a reward function to guide the Deep Deterministic Policy Gradient (DDPG) algorithm in learning the optimal routing strategy to provide superior QoS. In our empirical investigations, we compare the performance of Deep Reinforcement Learning (DRL) against that of Shortest Path (SP) algorithms in terms of data packet transmission delay. The experimental simulation results show that the proposed algorithm is effective in reducing network delay and improving overall transmission efficiency, and that it is superior to the traditional methods.
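A reward function built from delay, bandwidth, jitter, and packet loss, as the abstract above describes, might look like the following sketch. The abstract does not give the functional form, so the linear weighting, the weight values, and the assumption that every metric is normalized to [0, 1] are all illustrative choices; `qos_reward` is a hypothetical name.

```python
def qos_reward(delay, bandwidth, jitter, loss,
               w_delay=0.4, w_bandwidth=0.3, w_jitter=0.2, w_loss=0.1):
    """Weighted QoS reward for one routing decision.

    All four metrics are assumed to be normalized to [0, 1].
    Bandwidth is rewarded; delay, jitter, and loss are penalized.
    """
    return (w_bandwidth * bandwidth
            - w_delay * delay
            - w_jitter * jitter
            - w_loss * loss)

# Hypothetical link measurements (normalized).
good_path = qos_reward(delay=0.1, bandwidth=0.9, jitter=0.1, loss=0.0)
congested_path = qos_reward(delay=0.8, bandwidth=0.3, jitter=0.5, loss=0.2)
print(good_path > congested_path)  # True
```

In a DDPG setting, a scalar of this kind would be returned by the environment after each routing action; tuning the weights shifts the learned policy toward whichever metric matters most for the traffic class.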
Faced with the increasingly severe botnet problem on the Internet, detecting botnet traffic effectively in real time has become a critical problem. Although the existing deep Q-network (DQN) algorithm in deep reinforcement learning can solve the problem of real-time updating, its prediction results are always higher than the actual results. In botnet traffic detection, it performs well on the training set, where the accuracy of predicting traffic is as high as %; on the test set, however, its accuracy declines, and it cannot adjust its prediction strategy in time based on new data samples. On a new dataset, its accuracy declines significantly. Therefore, this paper proposes a botnet traffic detection system based on double-layer DQN (DDQN). Two Q-values are designed to adjust the model in policy and action, respectively, to achieve real-time model updates and improve the generality and robustness of the model across different datasets. Experiments show that, compared with the DQN model, the Q-value under DDQN is not overestimated, and the detection model achieves better accuracy and precision on botnet traffic. Moreover, on botnet datasets other than the test set, the accuracy and precision of the DDQN model remain higher than those of DQN.
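The overestimation issue mentioned above, and how using two sets of Q-values reduces it, can be sketched with the target computations below. This assumes the standard Double DQN formulation, in which one network selects the next action and the other evaluates it; the paper's "double-layer DQN" may differ in detail. The function names and the numbers are hypothetical.

```python
def dqn_target(reward, next_q_target, gamma=0.99):
    """Standard DQN target: the same values both select and evaluate the action."""
    return reward + gamma * max(next_q_target)

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: the online values select, the target values evaluate."""
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]

# Hypothetical Q-values for the next state (three actions).
next_q_online = [1.0, 2.0, 1.5]   # online network prefers action 1
next_q_target = [1.2, 0.8, 3.0]   # target network has a (possibly noisy) peak at action 2

y_dqn = dqn_target(1.0, next_q_target, gamma=0.9)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.9)
print(y_ddqn <= y_dqn)  # True
```

Because the evaluated value is never larger than the maximum over the same list, the decoupled target is never higher than the standard one, which is the mechanism that curbs inflated Q-values.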
To solve the problem of the low interference success rate against air defense missile radio fuzes caused by the uniform interference form of traditional fuze interference systems, an interference decision method based on the Q-learning algorithm is proposed. First, the distance between the missile and the target is divided into multiple states to enlarge the state space. Second, a multidimensional motion space, whose search range changes with the distance of the projectile, is used to select parameters and minimize the number of ineffective interference parameters. The interference effect is determined by detecting whether the fuze signal disappears. Finally, a weighted reward function is used to determine the reward value based on the range state, output power, and parameter quantity information of the interference form. The effectiveness of the proposed method in selecting the range of motion space parameters and in designing the discrimination of the reward function has been verified through offline experiments involving full-range missile rendezvous. The optimal interference form for each distance state has been obtained. Compared with the single-interference decision method, the proposed decision method can effectively improve the interference success rate.
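The two ingredients named above, discretizing the missile-target distance into states and updating a Q-table, can be sketched as follows. The bin edges, the number of actions, the learning rate, and the reward value are hypothetical; the abstract's "motion space" is read here as the action space of the Q-learning agent, which is an assumption.

```python
def distance_to_state(distance_m, bin_edges):
    """Map a distance to the index of the first bin whose upper edge covers it."""
    for i, edge in enumerate(bin_edges):
        if distance_m <= edge:
            return i
    return len(bin_edges)  # beyond the last edge

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (in place)."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error

# Hypothetical setup: 4 distance states, 3 candidate interference forms.
bin_edges = [100.0, 300.0, 600.0]
q_table = [[0.0] * 3 for _ in range(len(bin_edges) + 1)]

state = distance_to_state(450.0, bin_edges)       # -> 2
next_state = distance_to_state(250.0, bin_edges)  # -> 1
q_update(q_table, state, action=1, reward=1.0, next_state=next_state)
print(q_table[state][1])  # 0.1
```

After enough episodes, reading off the highest-valued action in each row gives one preferred interference form per distance state, which mirrors the per-state optimum the abstract reports.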
The collapse pressure is a key parameter when reinforced thermoplastic pipes (RTPs) are applied in harsh deep-water environments. To investigate the collapse of RTPs, numerical simulations and hydrostatic pressure tests are conducted. For the numerical simulations, eigenvalue analysis and Riks analysis are combined; the Hashin failure criterion and a fracture energy stiffness degradation model are used to simulate the progressive failure of composites, and "infinite" boundary conditions are applied to eliminate boundary effects. For the hydrostatic pressure tests, RTP specimens were placed in a hydrostatic chamber after being filled with water. The cross-section of the middle part was observed to collapse when the maximum pressure was reached. The collapse pressure obtained from the numerical simulations agrees well with that measured in the experiment. The applicability of the NASA SP-8007 formula to collapse pressure prediction is also discussed; it shows a relatively larger discrepancy because it neglects the progressive failure of composites. The parametric study finds that RTPs have a much higher first-ply-failure pressure when the winding angles are between 50° and 70°. The effects of debonding and initial ovality, and the contributions of the liner and coating, are also discussed.
This article investigates multi-circular path-following formation control with reinforced transient profiles for nonholonomic vehicles connected by a digraph. A multi-circular formation controller with spatial-temporal decoupling is devised for a group of vehicles guided by a virtual leader evolving along an implicit path, which allows circumnavigation on multiple circles with an expected angular spacing. In addition, since time-sensitive enclosing scenarios typically impose a stringent time constraint, an improved prescribed performance control (IPPC) using novel, tighter behavior boundaries is presented to enhance transient capabilities with guaranteed appointed-time convergence free from overshoot. The significant merit is that coordinated circumnavigation along different circles can be realized by executing geometric and dynamic assignments independently with modified transient profiles. Furthermore, all variables in the entire system are shown to be convergent. Simulation and experimental results are provided to validate the utility of the proposed solution.
This paper focuses on the development of a learning-based controller for a class of uncertain mechanical systems modeled by the Euler-Lagrange formulation. The considered system can describe the behavior of a large class of engineering systems, such as vehicular systems, robot manipulators, and satellites. These systems are often characterized by strong nonlinearities, heavy modeling uncertainties, and unknown perturbations, so nonlinear control approaches that rely on an accurate model become unavailable. Motivated by this challenge, a reinforcement learning (RL) adaptive control methodology based on the actor-critic framework is investigated to compensate for the uncertain mechanical dynamics. The approximation inaccuracies caused by RL and the exogenous unknown disturbances are handled via a continuous robust integral of the sign of the error (RISE) control approach. Unlike a classical RISE control law, a tanh(·) function is used instead of a sign(·) function to obtain a smoother control signal. The developed controller requires very little prior knowledge of the dynamic model, is robust to unknown dynamics and exogenous disturbances, and can achieve asymptotic output tracking. Finally, co-simulations through ADAMS and MATLAB/Simulink on a three degrees-of-freedom (3-DOF) manipulator and experiments on a real-time electromechanical servo system are performed to verify the performance of the proposed approach.
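The effect of replacing sign(·) with tanh(·) can be seen in a small numerical sketch. It compares only the two switching terms, not a full RISE control law; the `gain` parameter that sets the steepness of the tanh transition is a hypothetical addition for illustration.

```python
import math

def sign(x):
    """Discontinuous switching term: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def smooth_sign(x, gain=10.0):
    """Continuous surrogate: tanh saturates to +/-1 away from zero."""
    return math.tanh(gain * x)

# Sweep a tracking error through zero and compare the two terms.
for error in (-0.10, -0.01, 0.0, 0.01, 0.10):
    print(f"{error:+.2f}  sign={sign(error):+d}  tanh={smooth_sign(error):+.3f}")
```

Near zero error the sign term jumps between -1 and 1, which can cause chattering in the control signal, while the tanh term passes through zero continuously and still approaches the same limits for larger errors.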
As an important mechanism in multi-agent interaction, communication allows agents to form complex team relationships rather than remain a simple set of independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and it determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the agent's final action choice is ensured. Our evaluation on a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles with uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck of traditional deep reinforcement learning methods when dealing with POMDPs. Since the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partial observability that this RNN layer introduces inside the agent. The training curves show that the proposed RRTD3 successfully enhances data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
Multi-access Edge Cloud (MEC) networks extend cloud computing services and capabilities to the edge of the network. By bringing computation and storage capabilities closer to end-users and connected devices, MEC networks can support a wide range of applications. MEC networks can also leverage various types of resources, including computation resources, network resources, radio resources, and location-based resources, to provide multidimensional resources for intelligent applications in 5G/6G. However, tasks generated by users often consist of multiple subtasks that require different types of resources. Because the resources provided by devices are heterogeneous, offloading multi-resource task requests to the edge cloud so as to maximize benefits is a challenging problem. To address this issue, we mathematically model task requests with multiple subtasks. The problem of offloading multi-resource task requests is then proved to be NP-hard. Furthermore, we propose a novel Dual-Agent Deep Reinforcement Learning algorithm with Node First and Link features (NF_L_DA_DRL), based on the policy network, to optimize the benefits generated by offloading multi-resource task requests in MEC networks. Finally, simulation results show that the proposed algorithm can effectively improve the benefit of task offloading with higher resource utilization compared with baseline algorithms.
Funding (UHPCC impact-loading study): Funded by the National Key Research and Development Program of China (No. 2018YFC0705400), the National Natural Science Foundation of China (No. 51678142), and the Fundamental Research Funds for the Central Universities.
Funding: This work was financially supported by the National Natural Science Foundation of China (Nos. U1502274, 51834009, and 51974244).
Abstract: Copper matrix composites reinforced by in situ-formed hybrid titanium boride whiskers (TiBw) and titanium diboride particles (TiB2p) were fabricated by powder metallurgy. Microstructural observations showed competitive precipitation behavior between TiBw and TiB2p, where the relative contents of the two reinforcements varied with sintering temperature. Based on thermodynamic and kinetic assessments, the precipitation mechanisms of the hybrid reinforcements were discussed, and the formation of both TiBw and TiB2p from the local melting zone was found to be thermodynamically favored. The precipitation kinetics were mainly controlled by solid-state diffusion of B atoms. With the formation of a compact compound layer, the in situ reactions were divided into two stages, in which Zener growth and Dybkov growth prevailed, respectively. Accordingly, the competitive precipitation behavior was attributed to the transition of the growth model during the reaction process.
Funding: Supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643); Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore; Infocomm Media Development Authority under its Future Communications Research Development Programme; DSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019; Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme; DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme; MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
Abstract: Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
Abstract: Stroke is a leading cause of disability and mortality worldwide, necessitating the development of advanced technologies to improve its diagnosis, treatment, and patient outcomes. In recent years, machine learning techniques have emerged as promising tools in stroke medicine, enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches. This abstract provides a comprehensive overview of machine learning's applications, challenges, and future directions in stroke medicine. Recently introduced machine learning algorithms have been extensively employed in all fields of stroke medicine. Machine learning models have demonstrated remarkable accuracy in imaging analysis, diagnosing stroke subtypes, risk stratification, guiding medical treatment, and predicting patient prognosis. Despite the tremendous potential of machine learning in stroke medicine, several challenges must be addressed. These include the need for standardized and interoperable data collection, robust model validation and generalization, and the ethical considerations surrounding privacy and bias. In addition, integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care. Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis, tailored treatment selection, and improved prognostication. Continued research and collaboration among clinicians, researchers, and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care, ultimately leading to enhanced patient outcomes and quality of life. This review aims to summarize the current implications of machine learning in stroke diagnosis, treatment, and prognostic evaluation. It also explores the future perspectives these techniques can provide in combating this disabling disease.
Funding: Supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003); the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5); Beijing Natural Science Foundation (JQ19013).
Abstract: Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era and their vital role in promoting environmental protection and industrial intelligence.
Funding: Young Fund of Natural Science Foundation of Shaanxi Province, China (No. 2020JQ-121); Fundamental Research Funds for the Central Universities, China (No. 31020190502002).
Abstract: The forming of textile reinforcements is an important stage in the manufacturing of textile composite parts with the Liquid Composite Molding process. Fiber orientations and part geometry obtained from this stage have a significant impact on the subsequent resin injection and the final mechanical properties of the composite part. Numerical simulation of textile reinforcement forming is in strong demand, as it can greatly reduce the time and cost of determining optimized processing parameters, which is the foundation of the low-cost application of composite materials. This review presents the state of the art of forming modeling methods for textile reinforcements and the corresponding experimental characterization methods developed in this field. Microscopic, mesoscopic, and macroscopic models are discussed. Studies concerning the simulation of wrinkling are also presented, since it is the most common defect occurring in textile reinforcement forming. Finally, challenges and recommendations on future research directions for textile reinforcement modeling and experimental characterization are provided.
Funding: Supported in part by the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation of China (Grant No. 20220001068001); National Natural Science Foundation of China (Grant No. 61673327); Natural Science Basic Research Plan in Shaanxi Province, China (Grant No. 2023-JC-QN-0733); China Industry-University-Research Innovation Foundation (Grant No. 2022IT188).
Abstract: To address the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on an improved multi-agent deep reinforcement learning (MADRL) algorithm is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine the communication information between UAVs. Finally, simulation results of the algorithm in two sets of scenarios are given, and the results show that the method is effective and applicable.
Funding: Supported in part by the Start-Up Grant-Nanyang Assistant Professorship Grant of Nanyang Technological University; the Agency for Science, Technology and Research (A*STAR) under Advanced Manufacturing and Engineering (AME) Young Individual Research under Grant (A2084c0156); the MTC Individual Research Grant (M22K2c0079); the ANR-NRF Joint Grant (NRF2021-NRF-ANR003 HM Science); the Ministry of Education (MOE) under the Tier 2 Grant (MOE-T2EP50222-0002).
Abstract: While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both the training and testing processes using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. These results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LTGG24E090002), Zhejiang University of Water Resources and Electric Power (xky2022013), and the Major Science and Technology Plan Project of Zhejiang Provincial Department of Water Resources (RA1904). The authors also thank the water conservancy management department, Zhejiang Design Institute of Water Conservancy and Hydro Electric Power Co., Ltd., and the construction company for their support.
Abstract: The stability of ancient flood control levees is mainly influenced by water level fluctuations, groundwater concentration, and rainfall. This paper takes the Lanxi ancient levee as a research object and studies the evolution of its seepage, displacement, and stability before and after reinforcement with upside-down hanging wells and a grouting curtain, using numerical simulation combined with experiments and observations. The results indicate that the filled soil is less affected by water level fluctuations and groundwater concentration after reinforcement. A high groundwater level is detrimental to the levee's long-term stability, so drainage needs to be fully considered. The deformation of the reinforced levee is effectively controlled, since the fill deformation is mainly borne by the upside-down hanging wells. The safety factors of the levee before reinforcement vary significantly with the water level. The minimum safety factor is 0.886 during the period of falling water level, indicating a very high risk of instability. After reinforcement it reaches 1.478, so the stability of the ancient levee is improved by a large margin.
Funding: National Natural Science Foundation of China (Grant Nos. 61803348, 62173312, 51922009); Shanxi Province Key Laboratory of Quantum Sensing and Precision Measurement (Grant No. 201905D121001).
Abstract: Most research on target encircling control focuses on motion along a circular orbit in an ideal environment free from external disturbances. Elliptical encirclement with a time-varying observation radius may permit a more flexible and efficient enclosing solution. However, the non-orthogonality between the axial and tangential speed components, non-negligible environmental perturbations, and strict assignment requirements make elliptical encircling control more challenging, and the relevant investigations remain open. Along this line, an appointed-time elliptical encircling control rule capable of reinforcing circumnavigation performance is developed to enable Unmanned Aerial Vehicles (UAVs) to move along a specified elliptical path within a predetermined reaching time. The main merit of the designed strategy is that the relative distance control error is guaranteed to evolve within specified regions with a designer-specified convergence behavior. Meanwhile, wind perturbations can be counteracted online by an unknown system dynamics estimator (USDE) with only one tuning parameter and high computational efficiency. Lyapunov analysis demonstrates that all error variables involved are ultimately bounded, and simulations are carried out to confirm the usability of the proposed control algorithm.
Funding: National Key Research and Development Program of China (2020AAA0108100); the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100); the Shanghai Gaofeng and Gaoyuan Project for University Academic Program Development.
Abstract: Decision-making and motion planning are extremely important in autonomous driving to ensure safe driving in a real-world environment. This study proposes an online evolutionary decision-making and motion planning framework for autonomous driving based on a hybrid data- and model-driven method. First, a data-driven decision-making module based on deep reinforcement learning (DRL) is developed to pursue driving performance that is as rational as possible. Then, model predictive control (MPC) is employed to execute both longitudinal and lateral motion planning tasks. Multiple constraints are defined according to the vehicle's physical limits to meet the driving task requirements. Finally, two principles of safety and rationality for the self-evolution of autonomous driving are proposed. A motion envelope is established and embedded into a rational exploration and exploitation scheme, which filters out unreasonable experiences by masking unsafe actions so as to collect high-quality training data for the DRL agent. Experiments with a high-fidelity vehicle model and a MATLAB/Simulink co-simulation environment are conducted, and the results show that the proposed online-evolution framework is able to generate safer, more rational, and more efficient driving actions in a real-world environment.
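The idea of masking unsafe actions during exploration and exploitation can be illustrated with a minimal sketch. It assumes a discrete action set and a Boolean safety mask supplied by some external check (in the abstract, the motion envelope); the function name `masked_epsilon_greedy` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def masked_epsilon_greedy(q_values, safe_mask, epsilon, rng=random):
    """Choose an action index using epsilon-greedy over safe actions only.

    q_values  -- list of estimated action values (one per action)
    safe_mask -- list of bools; True marks an action judged safe
                 (assumed to come from an external safety check)
    epsilon   -- exploration probability in [0, 1]
    """
    safe = [i for i, ok in enumerate(safe_mask) if ok]
    if not safe:
        raise ValueError("no safe action available")
    if rng.random() < epsilon:
        # Explore, but never outside the safe set.
        return rng.choice(safe)
    # Exploit: best-valued action among the safe ones.
    return max(safe, key=lambda i: q_values[i])


# Action 1 has the highest value but is masked out, so action 2 is chosen.
chosen = masked_epsilon_greedy([1.0, 5.0, 3.0], [True, False, True], epsilon=0.0)
```

Because unsafe actions are never executed, the transitions stored for training contain only safe behavior, which is one plausible reading of how the filtered experience in the abstract is collected.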
Funding: State Grid Corporation of China Science and Technology Project "Research and Application of Key Technologies for Trusted Issuance and Security Control of Electronic Licenses for Power Business" (5700-202353318A-1-1-ZN).
Abstract: To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront the challenge of managing the surging demand for data traffic. In this setting, the network imposes stringent Quality of Service (QoS) requirements, revealing the inadequacies of traditional routing allocation mechanisms in accommodating such extensive data flows. To handle a substantial influx of data requests promptly and to alleviate the constraints of existing technologies and network congestion, we present an architecture for QoS routing optimization within a Software Defined Network (SDN), leveraging deep reinforcement learning. This approach separates SDN control and transmission functionalities, centralizing control over data forwarding while integrating deep reinforcement learning for informed routing decisions. By factoring in delay, bandwidth, jitter rate, and packet loss rate, we design a reward function to guide the Deep Deterministic Policy Gradient (DDPG) algorithm in learning the optimal routing strategy for superior QoS provision. In our empirical investigations, we compare the performance of Deep Reinforcement Learning (DRL) against that of Shortest Path (SP) algorithms in terms of data packet transmission delay. The experimental simulation results show that the proposed algorithm significantly reduces network delay and improves overall transmission efficiency, outperforming the traditional methods.
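A reward that combines delay, bandwidth, jitter, and packet loss could look like the following minimal sketch. The abstract does not give the actual formula, so the weighted-sum form, the weights, the normalization scales, and the name `qos_reward` are all assumptions made for illustration.

```python
def qos_reward(delay_ms, bandwidth_mbps, jitter_ms, loss_rate,
               weights=(0.4, 0.3, 0.2, 0.1),
               scales=(100.0, 1000.0, 20.0, 1.0)):
    """Hypothetical scalar reward for a routing decision.

    Each metric is normalized to [0, 1] by an assumed scale, then
    bandwidth is rewarded while delay, jitter, and loss are penalized.
    """
    w_delay, w_bw, w_jitter, w_loss = weights
    s_delay, s_bw, s_jitter, s_loss = scales
    delay = min(delay_ms / s_delay, 1.0)
    bandwidth = min(bandwidth_mbps / s_bw, 1.0)
    jitter = min(jitter_ms / s_jitter, 1.0)
    loss = min(loss_rate / s_loss, 1.0)
    return w_bw * bandwidth - w_delay * delay - w_jitter * jitter - w_loss * loss


# A fast, wide, stable path should score higher than a congested one.
good_path = qos_reward(delay_ms=10, bandwidth_mbps=800, jitter_ms=2, loss_rate=0.01)
bad_path = qos_reward(delay_ms=80, bandwidth_mbps=200, jitter_ms=15, loss_rate=0.1)
```

In a DDPG setup, a scalar of this kind would be returned by the environment after each routing action; the relative weights decide which QoS metric the learned policy prioritizes.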
Funding: Liaoning Province Applied Basic Research Program (2023JH2/101600038).
Abstract: In the face of the increasingly severe Botnet problem on the Internet, how to effectively detect Botnet traffic in real time has become a critical problem. Although the existing deep Q-network (DQN) algorithm in deep reinforcement learning can solve the problem of real-time updating, its predicted values are always higher than the actual values. In Botnet traffic detection, it performs well on the training set, where the accuracy of predicting traffic is as high as %; on the test set, however, its accuracy declines, and it cannot adjust its prediction strategy in time based on new data samples. On a new dataset, its accuracy declines significantly. Therefore, this paper proposes a Botnet traffic detection system based on double-layer DQN (DDQN). Two Q-values are designed to adjust the model in policy and action, respectively, to achieve real-time model updates and improve the universality and robustness of the model on different datasets. Experiments show that, compared with the DQN model, the Q-value is not overestimated when using DDQN, and the detection model improves the accuracy and precision of Botnet traffic detection. Moreover, on Botnet datasets other than the test set, the accuracy and precision of the DDQN model are still higher than those of DQN.
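The overestimation issue mentioned above can be made concrete by comparing two bootstrap targets. The sketch below uses the widely known Double DQN target, in which one set of Q-values selects the next action and another evaluates it; whether the paper's "double-layer DQN" is defined exactly this way is an assumption, and the function names are illustrative.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: the same values select and evaluate the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: online values select, target values evaluate."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


# Illustrative next-state values for three actions.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [1.5, 1.2, 3.0]

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
```

Since the evaluated value can never exceed the maximum of the target values, the second target is never larger than the first, which is the usual argument for why decoupling selection from evaluation curbs overestimation.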
Funding: National Natural Science Foundation of China (61973037); National 173 Program Project (2019-JCJQ-ZD-324).
Abstract: To solve the problem of the low interference success rate against air defense missile radio fuzes caused by the uniform interference form of traditional fuze interference systems, an interference decision method based on the Q-learning algorithm is proposed. First, the distance between the missile and the target is divided into multiple states to enlarge the state space. Second, a multidimensional motion space is used, whose search range changes with the distance of the projectile, to select parameters and minimize the number of ineffective interference parameters. The interference effect is determined by detecting whether the fuze signal disappears. Finally, a weighted reward function is used to determine the reward value based on the range state, output power, and parameter quantity information of the interference form. The effectiveness of the proposed method in selecting the range of motion space parameters and in designing the discrimination of the reward function has been verified through offline experiments involving full-range missile rendezvous, and the optimal interference form for each distance state has been obtained. Compared with the single-interference decision method, the proposed decision method effectively improves the interference success rate.
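For readers unfamiliar with Q-learning, the core update that such a decision method builds on can be sketched as follows. This is the textbook tabular rule, not the paper's implementation; the state labels, action names, learning rate, and discount factor are hypothetical placeholders.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q is a mapping from (state, action) pairs to value estimates.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Hypothetical setup: distance bins as states, interference forms as actions.
q_table = defaultdict(float)           # unseen pairs start at 0.0
actions = ["noise", "sweep", "deception"]

# One step: in the "far" state, "sweep" earned reward 1.0 and led to "mid".
new_value = q_learning_update(q_table, "far", "sweep", 1.0, "mid", actions)
```

In the setting described by the abstract, the reward passed to this update would come from the weighted reward function, and the set of admissible actions would shrink or grow with the distance state.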
Funding: Financially supported by the National Natural Science Foundation of China (Grant Nos. 52088102, 51879249); Fundamental Research Funds for the Central Universities (Grant No. 202261055).
Abstract: The collapse pressure is a key parameter when reinforced thermoplastic pipes (RTPs) are applied in harsh deep-water environments. To investigate the collapse of RTPs, numerical simulations and hydrostatic pressure tests are conducted. In the numerical simulations, eigenvalue analysis and Riks analysis are combined; the Hashin failure criterion and a fracture-energy stiffness degradation model are used to simulate the progressive failure of the composites, and "infinite" boundary conditions are applied to eliminate boundary effects. In the hydrostatic pressure tests, RTP specimens were filled with water and placed in a hydrostatic chamber. The cross-section of the middle part was observed to collapse when the maximum pressure was reached. The collapse pressure obtained from the numerical simulations agrees well with that from the experiment. The applicability of the NASA SP-8007 formula to collapse pressure prediction is also discussed; it shows a relatively larger deviation because it ignores the progressive failure of the composites. The parametric study finds that RTPs have a much higher first-ply-failure pressure when the winding angles are between 50° and 70°. In addition, the effects of debonding and initial ovality and the contributions of the liner and coating are discussed.
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62173312 and 61803348; in part by the National Major Scientific Instruments Development Project under Grant No. 61927807; in part by the Program for the Innovative Talents of Higher Education Institutions of Shanxi; Shanxi Province Science Foundation for Excellent Youths; in part by the Shanxi "1331 Project" Key Subjects Construction (1331KSC); in part by the Graduate Innovation Project of Shanxi Province under Grant No. 2021Y617.
Abstract: This article investigates multi-circular path-following formation control with reinforced transient profiles for nonholonomic vehicles connected by a digraph. A multi-circular formation controller with spatial-temporal decoupling is devised for a group of vehicles guided by a virtual leader evolving along an implicit path, which allows circumnavigation on multiple circles with a desired angular spacing. In addition, since time-sensitive enclosing scenarios typically impose a stringent time constraint, an improved prescribed performance control (IPPC) scheme using novel, tighter behavior boundaries is presented to enhance transient capabilities with guaranteed appointed-time convergence free from overshoot. The main merit is that coordinated circumnavigation along different circles can be realized by executing geometric and dynamic assignments independently with modified transient profiles. Furthermore, all variables in the entire system are shown to be convergent. Simulation and experimental results are provided to validate the utility of the proposed solution.
Funding: Supported in part by the National Key R&D Program of China under Grant 2021YFB2011300; the National Natural Science Foundation of China under Grant 52075262.
Abstract: This paper focuses on the development of a learning-based controller for a class of uncertain mechanical systems modeled by the Euler-Lagrange formulation. The considered system can describe the behavior of a large class of engineering systems, such as vehicular systems, robot manipulators, and satellites. These systems are often characterized by strong nonlinearities, heavy modeling uncertainties, and unknown perturbations, so nonlinear control approaches that rely on an accurate model are not applicable. Motivated by this challenge, a reinforcement learning (RL) adaptive control methodology based on the actor-critic framework is investigated to compensate for the uncertain mechanical dynamics. The approximation inaccuracies caused by RL and the exogenous unknown disturbances are handled via a continuous robust integral of the sign of the error (RISE) control approach. Unlike a classical RISE control law, a tanh(·) function is used instead of a sign(·) function to obtain a smoother control signal. The developed controller requires very little prior knowledge of the dynamic model, is robust to unknown dynamics and exogenous disturbances, and can achieve asymptotic output tracking. Finally, co-simulations through ADAMS and MATLAB/Simulink on a three degrees-of-freedom (3-DOF) manipulator and experiments on a real-time electromechanical servo system are performed to verify the performance of the proposed approach.
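The effect of replacing sign(·) with tanh(·) can be seen in a small numerical sketch. It only compares the two functions near zero; the slope parameter `k` and the function names are assumptions added for illustration and say nothing about the gains or the full control law used in the paper.

```python
import math


def sign(x):
    """Discontinuous sign function: -1, 0, or 1."""
    return (x > 0) - (x < 0)


def smooth_sign(x, k=10.0):
    """Smooth surrogate tanh(k * x); k is a hypothetical slope parameter."""
    return math.tanh(k * x)


# Near zero, sign jumps between -1 and 1 while tanh changes gradually.
errors = [-0.001, 0.0, 0.001, 1.0]
hard = [sign(e) for e in errors]
soft = [smooth_sign(e) for e in errors]
```

For a tracking error that chatters around zero, the hard version flips the control term between its extremes, whereas the smooth version stays close to zero and approaches ±1 only for larger errors.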
Abstract: As an important mechanism in multi-agent interaction, communication allows agents to form complex team relationships rather than remain a simple set of independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and it determines the interaction relationships between agents and the importance weight of each communication message through a series connection of hard- and self-attention mechanisms, realizing targeted message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the agent's final action choice is ensured. Our evaluation on a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12072090).
Abstract: This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to address the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles with uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck of traditional deep reinforcement learning methods when dealing with POMDPs. Since the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partially observable problem that this RNN layer causes inside the agent. The training curves show that the proposed RRTD3 improves data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62201105, 62331017, and 62075024; in part by the Natural Science Foundation of Chongqing under Grant cstc2021jcyj-msxmX0404; in part by the Chongqing Municipal Education Commission under Grant KJQN202100643; in part by Guangdong Basic and Applied Basic Research Foundation under Grant 2022A1515110056.
Abstract: Multi-access Edge Cloud (MEC) networks extend cloud computing services and capabilities to the edge of the network. By bringing computation and storage capabilities closer to end-users and connected devices, MEC networks can support a wide range of applications. MEC networks can also leverage various types of resources, including computation resources, network resources, radio resources, and location-based resources, to provide multidimensional resources for intelligent applications in 5/6G. However, tasks generated by users often consist of multiple subtasks that require different types of resources. Offloading multi-resource task requests to the edge cloud so as to maximize benefits is challenging because of the heterogeneity of the resources provided by devices. To address this issue, we mathematically model task requests with multiple subtasks. Then, the problem of offloading multi-resource task requests is proved to be NP-hard. Furthermore, we propose a novel Dual-Agent Deep Reinforcement Learning algorithm with Node First and Link features (NF_L_DA_DRL), based on the policy network, to optimize the benefits generated by offloading multi-resource task requests in MEC networks. Finally, simulation results show that the proposed algorithm can effectively improve the benefit of task offloading with higher resource utilization compared with baseline algorithms.