Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in the 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
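For orientation, MAPPO builds on the clipped surrogate objective of PPO. The abstract does not state the exact loss used, so the form below is the standard PPO objective written for a single agent i with local observation o_t^i, as an assumption; the symbols r_t, \hat{A}_t (advantage estimate) and \epsilon (clipping range) are not taken from the abstract.

```latex
r_t^{i}(\theta) = \frac{\pi_\theta(a_t^{i} \mid o_t^{i})}{\pi_{\theta_{\text{old}}}(a_t^{i} \mid o_t^{i})},
\qquad
L^{\text{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t^{i}(\theta)\,\hat{A}_t,\;
\operatorname{clip}\!\left(r_t^{i}(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t\right)\right]
```

Clipping the probability ratio keeps each policy update close to the previous policy, which is one reason PPO-style methods are a common starting point in multi-agent settings.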
To solve the problem of the low interference success rate against air defense missile radio fuzes caused by the single, uniform interference form of traditional fuze interference systems, an interference decision method based on the Q-learning algorithm is proposed. First, the distance between the missile and the target is divided into multiple states to enlarge the state space. Second, a multidimensional action space is used, whose search range changes with the distance of the projectile, to select parameters and minimize the number of ineffective interference parameters. The interference effect is determined by detecting whether the fuze signal disappears. Finally, a weighted reward function is used to determine the reward value based on the range state, output power, and parameter quantity information of the interference form. The effectiveness of the proposed method in selecting the parameter range of the action space and in designing the discrimination degree of the reward function has been verified through offline experiments involving full-range missile rendezvous. The optimal interference form for each distance state has been obtained. Compared with the single-interference decision method, the proposed decision method can effectively improve the success rate of interference.
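The decision loop described above can be sketched as tabular Q-learning. This is a minimal illustration, not the authors' implementation: the state indices (distance bins), the action indices (candidate interference forms), the reward weights, and the function names `weighted_reward`, `q_update`, and `choose_action` are all hypothetical.

```python
import random


def weighted_reward(success, range_state, power, n_params,
                    weights=(1.0, 0.5, 0.3, 0.2)):
    """Illustrative weighted reward (assumed form, not from the paper).

    success     -- True if the fuze signal disappeared after interference
    range_state -- index of the distance bin (assumed: larger is rewarded more)
    power       -- normalised output power of the interference form
    n_params    -- number of interference parameters used
    """
    w_success, w_range, w_power, w_params = weights
    if not success:
        return -1.0  # fixed penalty for ineffective interference
    return w_success + w_range * range_state - w_power * power - w_params * n_params


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One standard Q-learning update on the table q[state][action]."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q[s][a]


def choose_action(q, s, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among the actions available in state s."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[s]))
    return max(range(len(q[s])), key=lambda a: q[s][a])
```

After training, reading off the highest-valued action in each row of `q` gives one interference form per distance state, which mirrors the "optimal interference form for each distance state" mentioned in the abstract.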
Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, extensions of ADP for addressing control problems in complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey on ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
The electron's charge and spin degrees of freedom are at the core of modern electronic devices. With the in-depth investigation of two-dimensional materials, another degree of freedom, valley, has also attracted tremendous research interest. The intrinsic spontaneous valley polarization in two-dimensional magnetic systems, known as ferrovalley materials, makes it convenient to detect and modulate the valley. In this review, we first introduce the development of valleytronics. Then, the forms of valley polarization arising from p-, d-, and f-orbitals are discussed. Next, we discuss progress in modulating the valley polarization of two-dimensional ferrovalley materials by multiple physical fields, such as electric field, stacking mode, strain, and interface. Finally, we look forward to the future developments of valleytronics.
Contract Bridge, a four-player imperfect information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because it requires exchanging information with the partner while interfering with the opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge program that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10000 deals, with a much more cost-effective approach. This performance significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.
This paper focuses on the scheduling problem of workflow tasks that exhibit interdependencies. Unlike independent batch tasks, workflows typically consist of multiple subtasks with intrinsic correlations and dependencies. Scheduling therefore requires distributing the computational tasks to appropriate computing nodes in accordance with task dependencies to ensure the smooth completion of the entire workflow. Workflow scheduling must consider an array of factors, including task dependencies, availability of computational resources, and the schedulability of tasks. Therefore, this paper studies the workflow task scheduling problem in distributed graph databases and proposes a workflow scheduling methodology based on deep reinforcement learning (DRL). The method optimizes the maximum completion time (makespan) and response time of workflow tasks, aiming to enhance the responsiveness of workflow tasks while minimizing the makespan. The experimental results indicate that the Q-learning Deep Reinforcement Learning (Q-DRL) algorithm markedly reduces the makespan and improves the average response time within distributed graph database environments. In terms of makespan, Q-DRL achieves mean reductions of 12.4% and 11.9% over the established First-fit and Random scheduling strategies, respectively. Additionally, Q-DRL surpasses both the DRL-Cloud and Improved Deep Q-learning Network (IDQN) algorithms, with improvements of 4.4% and 2.6%, respectively. With reference to average response time, the Q-DRL approach performs significantly better in the scheduling of workflow tasks, decreasing the average by 2.27% and 4.71% compared to IDQN and DRL-Cloud, respectively. The Q-DRL algorithm also demonstrates a notable increase in the efficiency of system resource utilization, reducing the average idle rate by 5.02% and 9.30% in comparison to IDQN and DRL-Cloud, respectively. These findings support the assertion that Q-DRL not only maintains a lower average idle rate but also effectively reduces the average response time, thereby substantially improving processing efficiency and optimizing resource utilization within distributed graph database systems.
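To make the makespan objective concrete, the sketch below evaluates the makespan of one fixed task-to-node assignment for a dependency graph. It is an illustrative helper only, not the Q-DRL scheduler: the function name `makespan`, the input layout, and the rule that each node runs one task at a time in topological order are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def makespan(durations, deps, assignment):
    """Return the completion time of the last task in a workflow.

    durations  -- {task: run time}
    deps       -- {task: set of predecessor tasks}; tasks without entries have none
    assignment -- {task: node} chosen by some scheduler (hypothetical input)

    Assumption: each node executes one task at a time, and tasks are
    dispatched in a topological order of the dependency graph.
    """
    graph = {task: deps.get(task, ()) for task in durations}
    node_free = {}  # time at which each node becomes idle
    finish = {}     # finish time of each completed task
    for task in TopologicalSorter(graph).static_order():
        # A task can start once all predecessors are done and its node is idle.
        ready = max((finish[p] for p in graph[task]), default=0.0)
        node = assignment[task]
        start = max(ready, node_free.get(node, 0.0))
        finish[task] = start + durations[task]
        node_free[node] = finish[task]
    return max(finish.values(), default=0.0)
```

A learning-based scheduler would compare candidate assignments by a quantity of this kind; for example, moving one of two parallel subtasks to a second node shortens the critical path in the test case used for this sketch.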
Two-dimensional (2D) materials have attracted tremendous interest owing to their outstanding optoelectronic properties, showing new possibilities for future photovoltaic devices with high performance, high specific power, and flexibility. In recent years, substantial work has focused on 2D photovoltaic devices, and great progress has been achieved. Here, we review recent advances in 2D photovoltaic devices, focusing on 2D-material-based Schottky junctions, homojunctions, 2D-2D heterojunctions, 2D-3D heterojunctions, and bulk photovoltaic effect devices. Furthermore, advanced strategies for improving photovoltaic performance are described in detail. Finally, conclusions and an outlook are given, providing a guideline for the further development of 2D photovoltaic devices.
Unconventional antiferromagnetism, dubbed altermagnetism, was first discovered in rutile-structured magnets and is characterized by spin splitting even without the spin–orbit coupling effect. This interesting phenomenon has since been found in more altermagnetic materials. In this work, we explore two-dimensional altermagnetic materials by studying two series of two-dimensional magnets: MF4, with M covering all 3d and 4d transition metal elements, and TS2, with T = V, Cr, Mn, Fe. Through the magnetic symmetry operations of RuF4 and MnS2, it is verified that breaking time-reversal symmetry is a necessary condition for spin splitting. Based on symmetry analysis and first-principles calculations, we find that the electronic bands and magnon dispersion experience alternating spin splitting along the same path. This work paves the way for exploring altermagnetism in two-dimensional materials.
Antimony-based anodes have attracted wide attention in potassium-ion batteries due to their high theoretical specific capacities (∼660 mA h g^(-1)) and suitable voltage platforms. However, severe capacity fading caused by huge volume change and limited ion transport hinders their practical application. Recently, strategies for controlling the morphologies of Sb-based materials to improve their electrochemical performance have been proposed. Among these, two-dimensional Sb (2D-Sb) materials present excellent properties due to shortened ion migration paths and enhanced ion diffusion. Nevertheless, the synthetic methods are usually tedious, and the mechanism of these strategies remains elusive, especially regarding how to obtain 2D-Sb materials on a large scale. Herein, a novel strategy is provided to synthesize 2D-Sb material using a straightforward solvothermal method without the requirement of a complex nanostructure design. This method leverages the selective adsorption of aldehyde groups in furfural to induce crystal growth, while concurrently reducing and coating a nitrogen-doped carbon layer. Compared to the reported methods, it is simpler, more efficient, and conducive to the production of composite nanosheets with uniform thickness (3–4 nm). The 2D-Sb@NC nanosheet anode delivers an extremely high capacity of 504.5 mA h g^(-1) at a current density of 100 mA g^(-1) and remains stable for more than 200 cycles. Through characterizations and molecular dynamics simulations, the differences in potassium storage kinetics between 2D and bulk Sb-based materials are explored, and detailed explanations are provided. These findings offer novel insights into the development of durable 2D alloy-based anodes for next-generation potassium-ion batteries.
Valleytronics, which uses the valley degree of freedom to encode, process, and store information, may find practical applications in low-power-consumption devices. Recent theoretical and experimental studies have demonstrated that two-dimensional (2D) honeycomb lattice systems with inversion symmetry breaking, such as transition-metal dichalcogenides (TMDs), are ideal candidates for realizing valley polarization. In addition to the optical field, lifting the valley degeneracy of TMDs by introducing magnetism is an efficient way to manipulate the valley degree of freedom. In this paper, we first review the recent progress on valley polarization in various TMD-based systems, including magnetically doped TMDs, intrinsic TMDs with both inversion and time-reversal symmetry broken, and magnetic TMD heterostructures. When topologically nontrivial bands are introduced into valley-polarized systems, valley-polarized topological states, namely the valley-polarized quantum anomalous Hall effect, can be realized. Therefore, we also review the theoretical proposals for realizing valley-polarized topological states in 2D honeycomb lattices. Our paper can help readers quickly grasp the latest research developments in this field.
To improve the comprehensive mechanical properties of Al-Si-Cu alloy, it was treated by a high-pressure torsion (HPT) process, and the effect of the deformation degree on the microstructure and properties of the alloy was studied. The results show that the reinforcements (β-Si and θ-CuAl_(2) phases) of the Al-Si-Cu alloy are dispersed in the α-Al matrix phase with finer phase size after the treatment. The processed samples exhibit grain sizes in the submicron or even nanometer range, which effectively improves the mechanical properties of the material. The hardness and strength of the deformed alloy are significantly raised to 268 HV and 390.04 MPa, respectively, by a 10-turn HPT process, and the fracture morphology shows that the material gradually transitions from brittle to plastic fracture after deformation. The interdiffusion of elements at the interfaces between the phases is also effectively enhanced. In addition, it is found that severe plastic deformation at room temperature induces a ternary eutectic reaction, resulting in the formation of a ternary Al+Si+CuAl_(2) eutectic.
The driven-dissipative Langevin dynamics simulation is used to produce a two-dimensional (2D) dense cloud composed of charged dust particles trapped in a quadratic potential. A 2D mesh grid is built to analyze the center-to-wall dust density. It is found that the local dust density is more nonuniform in the outer region than in the inner region, consistent with the features of the quadratic potential. The dependences of the global dust density on equilibrium temperature, particle size, confinement strength, and confinement shape are investigated. It is found that the particle size, the confinement strength, and the confinement shape strongly affect the global dust density, while the equilibrium temperature has only a minor effect on it. In the direction where the confinement is stronger, the dust density gradient is larger.
Two-dimensional Ruddlesden-Popper (2DRP) perovskite exhibits excellent stability in perovskite solar cells (PSCs) owing to the introduction of hydrophobic long-chain organic spacers. However, the poor charge transport of bulky organic cation spacers limits the performance of 2DRP PSCs. Inspired by the A-site cation alloying strategy in 3D perovskites, 2DRP perovskites with a binary spacer can promote charge transport compared to their unary spacer counterparts. Herein, superior MA-based 2DRP perovskite films with a binary spacer, comprising 3-guanidinopropanoic acid (GPA) and 4-fluorophenethylamine (FPEA), are realized. These (GPA_(0.85)FPEA_(0.15))_(2)MA_(4)Pb_(5)I_(16) films show good morphology, large grain size, decreased trap state density, and preferential orientation. Accordingly, the present 2DRP-based PSC with the binary spacer achieves a remarkable efficiency of 18.37% with a V_(OC) of 1.15 V, a J_(SC) of 20.13 mA cm^(-2), and an FF of 79.23%. To our knowledge, this PCE value is the highest reported to date for binary spacer MA-based 2DRP (n≤5) PSCs. Importantly, owing to the hydrophobic fluorine group of FPEA and the enhanced interlayer interaction induced by FPEA, the unencapsulated 2DRP PSCs based on binary spacers exhibit much better humidity stability and thermal stability than their unary spacer counterparts.
To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront the challenge of managing the surging demand for data traffic. In this setting, the network imposes stringent Quality of Service (QoS) requirements, revealing the inadequacies of traditional routing allocation mechanisms in accommodating such extensive data flows. To handle a substantial influx of data requests promptly and to alleviate the constraints of existing technologies and network congestion, we present an architecture for QoS routing optimization within a Software Defined Network (SDN), leveraging deep reinforcement learning. This approach separates SDN control and transmission functionalities, centralizing control over data forwarding while integrating deep reinforcement learning for informed routing decisions. By factoring in delay, bandwidth, jitter rate, and packet loss rate, we design a reward function to guide the Deep Deterministic Policy Gradient (DDPG) algorithm in learning the optimal routing strategy to provide superior QoS. In our empirical investigations, we compare the performance of Deep Reinforcement Learning (DRL) against that of Shortest Path (SP) algorithms in terms of data packet transmission delay. The experimental simulation results show that our proposed algorithm is effective in reducing network delay and improving overall transmission efficiency, and is superior to the traditional methods.
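A reward of the kind described above, combining delay, bandwidth, jitter, and packet loss, can be sketched as a weighted sum. The abstract does not give the actual formula, so the linear form, the default weights, the assumption that all inputs are normalised to [0, 1], and the name `qos_reward` are illustrative only.

```python
def qos_reward(delay, bandwidth, jitter, loss,
               weights=(1.0, 1.0, 1.0, 1.0)):
    """Illustrative QoS reward for a routing agent (assumed form).

    All four metrics are assumed to be normalised to [0, 1] beforehand.
    Bandwidth is rewarded; delay, jitter, and packet loss are penalised.
    """
    w_bw, w_delay, w_jitter, w_loss = weights
    return (w_bw * bandwidth
            - w_delay * delay
            - w_jitter * jitter
            - w_loss * loss)
```

With such a reward, a path with the same bandwidth but lower delay always scores higher, so a DDPG-style agent maximising it is pushed toward low-latency routes; the relative weights decide how the four metrics trade off.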
The anomalous valley Hall effect (AVHE) can be used to explore and utilize valley degrees of freedom in materials, with potential applications in fields such as information storage, quantum computing, and optoelectronics. AVHE exists in two-dimensional (2D) materials possessing valley polarization (VP), and such 2D materials usually have a hexagonal honeycomb lattice. It is therefore necessary to obtain valleytronic materials with VP that are more readily synthesized and applied experimentally. In this topical review, we introduce recent developments in realizing VP as well as AVHE through different methods, i.e., doping with transition metal atoms, building ferrovalley heterostructures, and searching for ferrovalley materials. Moreover, 2D ferrovalley systems under external modulation are also discussed. 2D valleytronic materials with AVHE demonstrate excellent performance and potential applications, offering the possibility of realizing novel low-energy-consuming devices and facilitating the further development, miniaturization, and enhanced functionality of device technology.
A real-time adaptive role allocation method based on reinforcement learning is proposed to improve human-robot cooperation performance for a curtain wall installation task. This method breaks with the traditional idea that the robot is regarded only as the follower, or that cooperation merely switches between leader and follower. In this paper, a self-learning method is proposed that can dynamically adapt and continuously adjust the initiative weight of the robot according to changes in the task. First, a physical human-robot cooperation model that includes the role factor is built. Then, a reinforcement learning model that can adjust the role factor in real time is established, and a reward and action model is designed. The role factor can be adjusted continuously according to the comprehensive performance of the human-robot interaction force and the robot's jerk during repeated installation. Finally, the role adjustment rule established above continuously improves the comprehensive performance. Experiments were conducted on dynamic role allocation and on the effect of the performance weighting coefficient on the result. The results show that the proposed method can realize role adaptation and achieve the dual optimization goal of reducing both the sum of the cooperator force and the robot's jerk.
One hallmark of glasses is the existence of excess vibrational modes at low frequencies ω beyond Debye's prediction. Numerous studies suggest that understanding low-frequency excess vibrations could help gain insight into the anomalous mechanical and thermodynamic properties of glasses. However, there is still intensive debate as to the frequency dependence of the population of low-frequency excess vibrations. In particular, excess modes could hybridize with phonon-like modes, and the density of hybridized excess modes has been reported to follow D_(exc)(ω)~ω^(2) in 2D glasses with an inverse power law potential. Yet, the universality of the quadratic scaling remains unknown, since recent work suggested that interaction potentials could influence the scaling of the vibrational spectrum. Here, we extend the universality of the quadratic scaling for hybridized excess modes in 2D to glasses with potentials ranging from the purely repulsive soft-core interaction to the hard-core one with both repulsion and attraction, as well as to glasses with significant differences in density or interparticle repulsion. Moreover, we observe that the number of hybridized excess modes decreases in glasses with higher density or steeper interparticle repulsion, which is accompanied by a suppression of the strength of the sound attenuation. Our results indicate that the density bears some resemblance to the repulsive steepness of the interaction in influencing low-frequency properties.
This survey paper provides a review and perspective on intermediate and advanced reinforcement learning (RL) techniques in process industries. It offers a holistic approach by covering all levels of the process control hierarchy. The survey presents a comprehensive overview of RL algorithms, including fundamental concepts like Markov decision processes and different approaches to RL, such as value-based, policy-based, and actor-critic methods, while also discussing the relationship between classical control and RL. It further reviews the wide-ranging applications of RL in process industries, such as soft sensors, low-level control, high-level control, distributed process control, fault detection and fault-tolerant control, optimization, planning, scheduling, and supply chain. The survey discusses the limitations and advantages, trends and new applications, and opportunities and future prospects for RL in process industries. Moreover, it highlights the need for a holistic approach in complex systems due to the growing importance of digitalization in the process industries.
The development of communication technology will promote the application of the Internet of Things, and Beyond 5G will become a new technology driver. At the same time, Beyond 5G will become one of the important supports for the development of edge computing technology. This paper proposes a communication task allocation algorithm based on deep reinforcement learning for vehicle-to-pedestrian communication scenarios in edge computing. Through trial-and-error learning by the agent, the optimal spectrum and power for transmission can be determined without global information, so as to balance vehicle-to-pedestrian and vehicle-to-infrastructure communication. The results show that the agent can effectively improve the vehicle-to-infrastructure communication rate while meeting the delay constraints on the vehicle-to-pedestrian link.
While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both the training and testing processes using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. The results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
Funding (avatar task migration paper): Supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643); Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore; Infocomm Media Development Authority under its Future Communications Research Development Programme; DSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019; Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme; DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme; MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
Abstract: Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in the 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. The high mobility of vehicles, however, makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome the slow convergence resulting from the curse of dimensionality and the non-stationarity caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and to ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
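As a rough illustration of the optimization step that MAPPO-style methods build on, the following sketch computes the standard PPO clipped surrogate objective over a batch of samples. It is not the authors' implementation: the function name, the `clip_eps` default of 0.2, and the use of plain Python lists are assumptions made for the example, and the transformer-based agent representation described in the abstract is not modelled.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximised).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. In a shared-parameter multi-agent
    setting, samples from all agents can simply be pooled into one batch.
    clip_eps is an assumed default, not a value from the abstract.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                      # importance ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)       # pessimistic bound
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage; the clipping only takes effect once the policy has moved away from the one that collected the data.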
Funding: National Natural Science Foundation of China (61973037); National 173 Program Project (2019-JCJQ-ZD-324).
Abstract: To solve the problem of the low interference success rate against air defense missile radio fuzes caused by the uniform interference form of traditional fuze interference systems, an interference decision method based on the Q-learning algorithm is proposed. First, the distance between the missile and the target is divided into multiple states to enlarge the state space. Second, a multidimensional motion space is used whose search range changes with the distance of the projectile, so as to select parameters and minimize the number of ineffective interference parameters. The interference effect is determined by detecting whether the fuze signal disappears. Finally, a weighted reward function determines the reward value based on the range state, output power, and parameter quantity information of the interference form. The effectiveness of the proposed method in selecting the range of motion space parameters and in designing the discrimination degree of the reward function has been verified through offline experiments involving full-range missile rendezvous, and the optimal interference form for each distance state has been obtained. Compared with the single-interference decision method, the proposed decision method can effectively improve the success rate of interference.
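The decision loop described above can be sketched as tabular Q-learning with a state-dependent action set. This is only an illustration under stated assumptions: the abstract's "motion space" is read here as the action space, and the windowing rule in `allowed_actions`, the reward weights and signs in `weighted_reward`, and the learning parameters are all hypothetical rather than taken from the paper.

```python
from collections import defaultdict

def allowed_actions(distance_state, n_actions):
    """Hypothetical rule: the searchable parameter indices grow with the
    distance-state index (the paper's actual ranges are not given)."""
    return list(range(min(n_actions, distance_state + 2)))

def weighted_reward(fuze_signal_gone, distance_state, power, n_params,
                    weights=(10.0, 0.5, 0.1, 0.1)):
    """Assumed weighted sum over success, range state, output power and
    parameter count; weights and signs are illustrative only."""
    w_hit, w_dist, w_pow, w_num = weights
    hit = 1.0 if fuze_signal_gone else 0.0
    return w_hit * hit + w_dist * distance_state - w_pow * power - w_num * n_params

def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step that bootstraps only over the actions
    allowed in next_state."""
    best_next = max(q[(next_state, a)]
                    for a in allowed_actions(next_state, n_actions))
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

# Example: one update from distance state 2 to distance state 1.
q_table = defaultdict(float)
r = weighted_reward(True, distance_state=2, power=5.0, n_params=3)
q_update(q_table, state=2, action=1, reward=r, next_state=1, n_actions=6)
```

Restricting the maximisation to `allowed_actions(next_state, ...)` is what keeps ineffective parameters out of the search for a given distance state.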
Funding: Supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003); the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5); Beijing Natural Science Foundation (JQ19013).
Abstract: Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12074301 and 12004295); China's Postdoctoral Science Foundation funded project (Grant No. 2022M722547); the Open Project of State Key Laboratory of Surface Physics (Grant No. KF2022 09); the Natural Science Foundation of Guizhou Provincial Education Department (Grant No. ZK[2021]034).
Abstract: The electron's charge and spin degrees of freedom are at the core of modern electronic devices. With the in-depth investigation of two-dimensional materials, another degree of freedom, valley, has also attracted tremendous research interest. The intrinsic spontaneous valley polarization in two-dimensional magnetic systems, known as ferrovalley materials, makes it convenient to detect and modulate the valley. In this review, we first introduce the development of valleytronics. Then, the forms of valley polarization arising from p-, d-, and f-orbitals are discussed. Next, we discuss the progress in modulating the valley polarization of two-dimensional ferrovalley materials by multiple physical fields, such as electric field, stacking mode, strain, and interface. Finally, we look forward to the future developments of valleytronics.
Abstract: Contract Bridge, a four-player imperfect-information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because it requires exchanging information with the partner while interfering with the opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge software that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10000 deals, with a much more cost-effective approach. This performance significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.
Funding: Funded by the Science and Technology Foundation of State Grid Corporation of China (Grant No. 5108-202218280A-2-397-XG).
Abstract: This paper focuses on the scheduling problem of workflow tasks that exhibit interdependencies. Unlike independent batch tasks, workflows typically consist of multiple subtasks with intrinsic correlations and dependencies. Scheduling therefore requires distributing the various computational tasks to appropriate computing node resources in accordance with task dependencies to ensure the smooth completion of the entire workflow. Workflow scheduling must consider an array of factors, including task dependencies, availability of computational resources, and the schedulability of tasks. Therefore, this paper delves into the distributed graph database workflow task scheduling problem and proposes a workflow scheduling methodology based on deep reinforcement learning (DRL). The method optimizes the maximum completion time (makespan) and response time of workflow tasks, aiming to enhance the responsiveness of workflow tasks while minimizing the makespan. The experimental results indicate that the Q-learning Deep Reinforcement Learning (Q-DRL) algorithm markedly reduces the makespan and improves the average response time within distributed graph database environments. In terms of makespan, Q-DRL achieves mean reductions of 12.4% and 11.9% over the established First-fit and Random scheduling strategies, respectively. Additionally, Q-DRL surpasses the performance of both the DRL-Cloud and Improved Deep Q-learning Network (IDQN) algorithms, with improvements of 4.4% and 2.6%, respectively. With reference to average response time, the Q-DRL approach exhibits significantly enhanced performance in the scheduling of workflow tasks, decreasing the average by 2.27% and 4.71% when compared to IDQN and DRL-Cloud, respectively. The Q-DRL algorithm also demonstrates a notable increase in the efficiency of system resource utilization, reducing the average idle rate by 5.02% and 9.30% in comparison to IDQN and DRL-Cloud, respectively. These findings support the assertion that Q-DRL not only maintains a lower average idle rate but also effectively reduces the average response time, thereby substantially improving processing efficiency and optimizing resource utilization within distributed graph database systems.
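To make the makespan metric concrete, the sketch below schedules a small dependency graph with a simple first-fit style rule and reports the resulting makespan. It is one reading of a "First-fit" baseline rather than the authors' code: the function name, the task and node representation, and the tie-breaking rule are assumptions for illustration.

```python
def first_fit_makespan(durations, deps, order, n_nodes):
    """Schedule tasks in a given topological order and return
    (makespan, finish_times).

    durations: {task: run time}; deps: {task: [predecessor tasks]};
    order: a topological order of the tasks; n_nodes: number of
    identical computing nodes (all hypothetical inputs).
    """
    node_free = [0.0] * n_nodes      # time at which each node becomes idle
    finish = {}
    for task in order:
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in deps.get(task, [])), default=0.0)
        # First fit: the first node already idle at the ready time,
        # otherwise the node that frees up earliest.
        node = next((i for i, t in enumerate(node_free) if t <= ready),
                    min(range(n_nodes), key=node_free.__getitem__))
        start = max(ready, node_free[node])
        finish[task] = start + durations[task]
        node_free[node] = finish[task]
    return max(finish.values()), finish

# Example workflow: c depends on a and b, d depends on c.
durations = {"a": 2, "b": 3, "c": 1, "d": 2}
deps = {"c": ["a", "b"], "d": ["c"]}
makespan, _ = first_fit_makespan(durations, deps, ["a", "b", "c", "d"], n_nodes=2)
```

A learned scheduler would replace the node-selection line with a policy's choice, while the makespan is still measured as the latest finish time.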
Funding: Supported by the National Natural Science Foundation of China (52322210, 52172144, 22375069, 21825103, and U21A2069); National Key R&D Program of China (2021YFA1200501); Shenzhen Science and Technology Program (JCYJ20220818102215033, JCYJ20200109105422876); the Innovation Project of Optics Valley Laboratory (OVL2023PY007).
Abstract: Two-dimensional (2D) materials have attracted tremendous interest owing to their outstanding optoelectronic properties, opening new possibilities for future photovoltaic devices with high performance, high specific power, and flexibility. In recent years, substantial work has focused on 2D photovoltaic devices, and great progress has been achieved. Here, we review recent advances in 2D photovoltaic devices, focusing on 2D-material-based Schottky junctions, homojunctions, 2D-2D heterojunctions, 2D-3D heterojunctions, and bulk photovoltaic effect devices. Furthermore, advanced strategies for improving photovoltaic performance are described in detail. Finally, conclusions and an outlook are given, providing a guideline for the further development of 2D photovoltaic devices.
Funding: The National Natural Science Foundation of China (Grant No. 12004439); Hunan Province Postgraduate Research and Innovation Project (Grant No. CX20230229); the computational resources from the High Performance Computing Center of Central South University.
Abstract: Unconventional antiferromagnetism, dubbed altermagnetism, was first discovered in rutile-structured magnets and is characterized by spin splitting even without spin-orbit coupling. This interesting phenomenon has since been discovered in more altermagnetic materials. In this work, we explore two-dimensional altermagnetic materials by studying two series of two-dimensional magnets: MF4, with M covering all 3d and 4d transition metal elements, and TS2, with T = V, Cr, Mn, Fe. Through the magnetic symmetry operations of RuF4 and MnS2, it is verified that breaking time-reversal symmetry is a necessary condition for spin splitting. Based on symmetry analysis and first-principles calculations, we find that the electronic bands and the magnon dispersion experience alternating spin splitting along the same path. This work paves the way for exploring altermagnetism in two-dimensional materials.
Funding: Financially supported by the Science and Technology Development Program of Jilin Province (YDZJ202101ZYTS185); the National Natural Science Foundation of China (21975250).
Abstract: Antimony-based anodes have attracted wide attention in potassium-ion batteries due to their high theoretical specific capacities (~660 mA h g⁻¹) and suitable voltage platforms. However, severe capacity fading caused by huge volume change and limited ion transport hinders their practical application. Recently, strategies for controlling the morphologies of Sb-based materials to improve electrochemical performance have been proposed. Among these, two-dimensional Sb (2D-Sb) materials present excellent properties due to shortened ion migration paths and enhanced ion diffusion. Nevertheless, the synthetic methods are usually tedious, and the mechanisms of these strategies remain elusive, especially regarding how to obtain large-scale 2D-Sb materials. Herein, a novel strategy to synthesize 2D-Sb material using a straightforward solvothermal method, without the need for a complex nanostructure design, is provided. This method leverages the selective adsorption of aldehyde groups in furfural to induce crystal growth, while concurrently reducing and coating a nitrogen-doped carbon layer. Compared to the reported methods, it is simpler, more efficient, and conducive to the production of composite nanosheets with uniform thickness (3–4 nm). The 2D-Sb@NC nanosheet anode delivers an extremely high capacity of 504.5 mA h g⁻¹ at a current density of 100 mA g⁻¹ and remains stable for more than 200 cycles. Through characterizations and molecular dynamics simulations, the differences in potassium storage kinetics between 2D Sb-based materials and bulk Sb-based materials are explored, and detailed explanations are provided. These findings offer novel insights into the development of durable 2D alloy-based anodes for next-generation potassium-ion batteries.
Abstract: Valleytronics, which uses the valley degree of freedom to encode, process, and store information, may find practical applications in low-power-consumption devices. Recent theoretical and experimental studies have demonstrated that two-dimensional (2D) honeycomb lattice systems with inversion symmetry breaking, such as transition-metal dichalcogenides (TMDs), are ideal candidates for realizing valley polarization. In addition to the optical field, lifting the valley degeneracy of TMDs by introducing magnetism is an efficient way to manipulate the valley degree of freedom. In this paper, we first review the recent progress on valley polarization in various TMD-based systems, including magnetically doped TMDs, intrinsic TMDs with both inversion and time-reversal symmetry broken, and magnetic TMD heterostructures. When topologically nontrivial bands are introduced into valley-polarized systems, valley-polarized topological states, namely the valley-polarized quantum anomalous Hall effect, can be realized. Therefore, we also review the theoretical proposals for realizing valley-polarized topological states in 2D honeycomb lattices. Our paper can help readers quickly grasp the latest research developments in this field.
Funding: Funded by the National Natural Science Foundation of China (No. 51905215); Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX23_1233); Major Scientific and Technological Innovation Project of Shandong Province of China (No. 2019JZZY020111); the National College Students Innovation and Entrepreneurship Training Program of China (No. CX2022415).
Abstract: To improve the comprehensive mechanical properties of Al-Si-Cu alloy, it was treated by a high-pressure torsion (HPT) process, and the effect of the deformation degree on the microstructure and properties of the Al-Si-Cu alloy was studied. The results show that the reinforcements (β-Si and θ-CuAl₂ phases) of the Al-Si-Cu alloy are dispersed in the α-Al matrix phase with finer phase size after the treatment. The processed samples exhibit grain sizes in the submicron or even nanometer range, which effectively improves the mechanical properties of the material. The hardness and strength of the deformed alloy are significantly raised, to 268 HV and 390.04 MPa respectively, by a 10-turn HPT process, and the fracture morphology shows that the material transitions from brittle to plastic fracture after deformation. The interdiffusion of elements at the interface between the phases has also been effectively enhanced. In addition, it is found that severe plastic deformation at room temperature induces a ternary eutectic reaction, resulting in the formation of a ternary Al+Si+CuAl₂ eutectic.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12275354 and 11805272); the Civil Aviation University of China (Grant No. 3122023PT08).
Abstract: A driven-dissipative Langevin dynamics simulation is used to produce a two-dimensional (2D) dense cloud composed of charged dust particles trapped in a quadratic potential. A 2D mesh grid is built to analyze the center-to-wall dust density. It is found that the local dust density is more nonuniform in the outer region than in the inner region, consistent with the feature of the quadratic potential. The dependences of the global dust density on equilibrium temperature, particle size, confinement strength, and confinement shape are investigated. It is found that the particle size, the confinement strength, and the confinement shape strongly affect the global dust density, while the equilibrium temperature has only a minor effect on it. In the direction where the confinement is stronger, the dust density gradient is larger.
Funding: Financially supported by the Natural Science Foundation of China (Grant Nos. 52372226, 52173263, 62004167); the Natural Science Basic Research Plan in Shaanxi Province of China (Grant Nos. 2022JM-315, 2023-JC-QN-0643); the National Key R&D Program of China (Grant No. 2022YFB3603703); the Qinchuangyuan High-level Talent Project of Shaanxi (Grant No. QCYRCXM-2022-219); the Ningbo Natural Science Foundation (Grant No. 2022J061); the Key Research and Development Program of Shaanxi (Grant No. 2023GXLH-091); the Shccig-Qinling Program and the Fundamental Research Funds for the Central Universities.
Abstract: Two-dimensional Ruddlesden-Popper (2DRP) perovskites exhibit excellent stability in perovskite solar cells (PSCs) owing to the introduction of hydrophobic long-chain organic spacers. However, the poor charge transport of bulky organic cation spacers limits the performance of 2DRP PSCs. Inspired by the A-site cation alloying strategy in 3D perovskites, 2DRP perovskites with a binary spacer can promote charge transport compared to their unary spacer counterparts. Herein, superior MA-based 2DRP perovskite films with a binary spacer, comprising 3-guanidinopropanoic acid (GPA) and 4-fluorophenethylamine (FPEA), are realized. The as-prepared (GPA₀.₈₅FPEA₀.₁₅)₂MA₄Pb₅I₁₆ films show good morphology, large grain size, decreased trap state density, and preferential orientation. Accordingly, the present 2DRP-based PSC with the binary spacer achieves a remarkable efficiency of 18.37% with a V_OC of 1.15 V, a J_SC of 20.13 mA cm⁻², and an FF of 79.23%. To our knowledge, this PCE value is the highest reported to date for binary spacer MA-based 2DRP (n≤5) PSCs. Importantly, owing to the hydrophobic fluorine group of FPEA and the enhanced interlayer interaction provided by FPEA, the unencapsulated 2DRP PSCs based on binary spacers exhibit much better humidity stability and thermal stability than their unary spacer counterparts.
Funding: State Grid Corporation of China Science and Technology Project "Research and Application of Key Technologies for Trusted Issuance and Security Control of Electronic Licenses for Power Business" (5700-202353318A-1-1-ZN).
Abstract: To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront the challenge of managing the surging demand for data traffic. Within this realm, the network imposes stringent Quality of Service (QoS) requirements, revealing the inadequacies of traditional routing allocation mechanisms in accommodating such extensive data flows. In response to the imperative of handling a substantial influx of data requests promptly and alleviating the constraints of existing technologies and network congestion, we present an architecture for QoS routing optimization within Software Defined Network (SDN), leveraging deep reinforcement learning. This approach entails the separation of SDN control and transmission functionalities, centralizing control over data forwarding while integrating deep reinforcement learning for informed routing decisions. By factoring in considerations such as delay, bandwidth, jitter rate, and packet loss rate, we design a reward function to guide the Deep Deterministic Policy Gradient (DDPG) algorithm in learning the optimal routing strategy to furnish superior QoS provision. In our empirical investigations, we compare the performance of Deep Reinforcement Learning (DRL) against that of Shortest Path (SP) algorithms in terms of data packet transmission delay. The experimental simulation results show that our proposed algorithm is effective in reducing network delay and improving the overall transmission efficiency, and is superior to the traditional methods.
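A reward of the kind described above can be sketched as a weighted combination of normalised QoS metrics. The abstract names the four metrics but not how they are combined, so the linear form, the weights, and the normalisation scales below are all assumptions chosen for illustration.

```python
def qos_reward(delay_ms, bandwidth_mbps, jitter_ms, loss_rate,
               weights=(0.4, 0.3, 0.15, 0.15),
               scales=(100.0, 1000.0, 20.0)):
    """Scalar reward for a routing decision: higher is better.

    weights = (delay, bandwidth, jitter, loss) and scales = (max delay,
    max bandwidth, max jitter) are hypothetical values, not taken from
    the paper. Each metric is clipped to [0, 1] after normalisation.
    """
    w_delay, w_bw, w_jitter, w_loss = weights
    max_delay, max_bw, max_jitter = scales

    def clip01(x):
        return max(0.0, min(1.0, x))

    return (w_bw * clip01(bandwidth_mbps / max_bw)      # reward throughput
            - w_delay * clip01(delay_ms / max_delay)    # penalise delay
            - w_jitter * clip01(jitter_ms / max_jitter) # penalise jitter
            - w_loss * clip01(loss_rate))               # penalise loss
```

With these assumed weights, a path with 20 ms delay, 800 Mbps bandwidth, 2 ms jitter, and 1% loss scores higher than one with 80 ms delay, 300 Mbps, 10 ms jitter, and 5% loss, which is the ordering a DDPG agent would be trained to prefer.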
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12274264 and 11674197); the Natural Science Foundation of Shandong Province of China (Grant Nos. ZR2022MA039 and ZR2021MA105); the Qing-Chuang Science and Technology Plan of Shandong Province of China (Grant No. 2019KJJ014).
Abstract: The anomalous valley Hall effect (AVHE) can be used to explore and utilize valley degrees of freedom in materials, with potential applications in fields such as information storage, quantum computing, and optoelectronics. AVHE exists in two-dimensional (2D) materials possessing valley polarization (VP), and such 2D materials usually have a hexagonal honeycomb lattice. Therefore, it is necessary to obtain valleytronic materials with VP that can be more readily synthesized and applied experimentally. In this topical review, we introduce recent developments in realizing VP as well as AVHE through different methods, i.e., doping with transition metal atoms, building ferrovalley heterostructures, and searching for ferrovalley materials. Moreover, 2D ferrovalley systems under external modulation are also discussed. 2D valleytronic materials with AVHE demonstrate excellent performance and potential applications, offering the possibility of realizing novel low-energy-consumption devices and facilitating the further development, miniaturization, and enhanced functionality of device technology.
Funding: The research has been generously supported by the Tianjin Education Commission Scientific Research Program (2020KJ056), China; Tianjin Science and Technology Planning Project (22YDTPJC00970), China. The authors would like to express their sincere appreciation for all support provided.
Abstract: A real-time adaptive role allocation method based on reinforcement learning is proposed to improve human-robot cooperation performance for a curtain wall installation task. This method departs from the traditional approach in which the robot is regarded only as the follower, or in which only the leader and follower roles are switched during cooperation. In this paper, a self-learning method is proposed that can dynamically adapt and continuously adjust the initiative weight of the robot according to changes in the task. First, a physical human-robot cooperation model that includes the role factor is built. Then, a reinforcement learning model that can adjust the role factor in real time is established, and a reward and action model is designed. The role factor can be adjusted continuously according to the comprehensive performance of the human-robot interaction force and the robot's jerk during repeated installation. Finally, the role adjustment rule established above continuously improves the comprehensive performance. Experiments verify the dynamic role allocation and the effect of the performance weighting coefficient on the result. The results show that the proposed method can realize role adaptation and achieve the dual optimization goal of reducing the sum of the cooperator force and the robot's jerk.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 12374202 and 12004001); Anhui Projects (Grant Nos. 2022AH020009, S020218016, and Z010118169); Hefei City (Grant No. Z020132009); Anhui University (start-up fund).
Abstract: One hallmark of glasses is the existence of excess vibrational modes at low frequencies ω beyond Debye's prediction. Numerous studies suggest that understanding low-frequency excess vibrations could help gain insight into the anomalous mechanical and thermodynamic properties of glasses. However, there is still intensive debate as to the frequency dependence of the population of low-frequency excess vibrations. In particular, excess modes could hybridize with phonon-like modes, and the density of hybridized excess modes has been reported to follow D_exc(ω) ~ ω² in 2D glasses with an inverse power law potential. Yet, the universality of the quadratic scaling remains unknown, since recent work suggested that interaction potentials could influence the scaling of the vibrational spectrum. Here, we extend the universality of the quadratic scaling for hybridized excess modes in 2D to glasses with potentials ranging from the purely repulsive soft-core interaction to the hard-core one with both repulsion and attraction, as well as to glasses with significant differences in density or interparticle repulsion. Moreover, we observe that the number of hybridized excess modes decreases in glasses with higher density or steeper interparticle repulsion, which is accompanied by a suppression of the strength of the sound attenuation. Our results indicate that the density bears some resemblance to the repulsive steepness of the interaction in influencing low-frequency properties.
Funding: Supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Abstract: This survey paper provides a review and perspective on intermediate and advanced reinforcement learning (RL) techniques in process industries. It offers a holistic approach by covering all levels of the process control hierarchy. The survey paper presents a comprehensive overview of RL algorithms, including fundamental concepts like Markov decision processes and different approaches to RL, such as value-based, policy-based, and actor-critic methods, while also discussing the relationship between classical control and RL. It further reviews the wide-ranging applications of RL in process industries, such as soft sensors, low-level control, high-level control, distributed process control, fault detection and fault tolerant control, optimization, planning, scheduling, and supply chain. The survey paper discusses the limitations and advantages, trends and new applications, and opportunities and future prospects for RL in process industries. Moreover, it highlights the need for a holistic approach in complex systems due to the growing importance of digitalization in the process industries.
Funding: Supported by the National Natural Science Foundation of China (No. 61871283); the Foundation of Pre-Research on Equipment of China (No. 61400010304); Major Civil-Military Integration Project in Tianjin, China (No. 18ZXJMTG00170).
Abstract: The development of communication technology will promote the application of the Internet of Things, and Beyond 5G will become a new technology driver. At the same time, Beyond 5G will become one of the important supports for the development of edge computing technology. This paper proposes a communication task allocation algorithm based on deep reinforcement learning for vehicle-to-pedestrian communication scenarios in edge computing. Through the agent's trial-and-error learning, the optimal spectrum and power for transmission can be determined without global information, so as to balance vehicle-to-pedestrian and vehicle-to-infrastructure communication. The results show that the agent can effectively improve the vehicle-to-infrastructure communication rate while meeting the delay constraints on the vehicle-to-pedestrian link.
Funding: Supported in part by the Start-Up Grant-Nanyang Assistant Professorship Grant of Nanyang Technological University; the Agency for Science, Technology and Research (A*STAR) under Advanced Manufacturing and Engineering (AME) Young Individual Research under Grant (A2084c0156); the MTC Individual Research Grant (M22K2c0079); the ANR-NRF Joint Grant (NRF2021-NRF-ANR003 HM Science); the Ministry of Education (MOE) under the Tier 2 Grant (MOE-T2EP50222-0002).
Abstract: While autonomous vehicles are vital components of intelligent transportation systems, ensuring the trustworthiness of decision-making remains a substantial challenge in realizing autonomous driving. Therefore, we present a novel robust reinforcement learning approach with safety guarantees to attain trustworthy decision-making for autonomous vehicles. The proposed technique ensures decision trustworthiness in terms of policy robustness and collision safety. Specifically, an adversary model is learned online to simulate the worst-case uncertainty by approximating the optimal adversarial perturbations on the observed states and environmental dynamics. In addition, an adversarial robust actor-critic algorithm is developed to enable the agent to learn robust policies against perturbations in observations and dynamics. Moreover, we devise a safety mask to guarantee the collision safety of the autonomous driving agent during both the training and testing processes using an interpretable knowledge model known as the Responsibility-Sensitive Safety Model. Finally, the proposed approach is evaluated through both simulations and experiments. The results indicate that the autonomous driving agent can make trustworthy decisions and drastically reduce the number of collisions through robust safety policies.
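The idea of a safety mask can be illustrated with the commonly cited longitudinal safe-distance rule from Responsibility-Sensitive Safety (RSS). The sketch below is not the authors' mask: the parameter values (response time `rho`, acceleration and braking limits), the discrete list of candidate accelerations, and the fallback rule are assumptions for illustration only.

```python
def rss_min_gap(v_rear, v_front, rho=0.5, a_accel_max=3.0,
                b_min=4.0, b_max=8.0):
    """Minimum safe longitudinal gap (m) between a rear and a front vehicle.

    Speeds in m/s; rho is the rear vehicle's response time (s). The rear
    vehicle may accelerate at up to a_accel_max during rho and then brakes
    at b_min at least; the front vehicle brakes at b_max at most.
    All default parameter values are assumed.
    """
    v_after = v_rear + rho * a_accel_max
    gap = (v_rear * rho + 0.5 * a_accel_max * rho ** 2
           + v_after ** 2 / (2.0 * b_min)
           - v_front ** 2 / (2.0 * b_max))
    return max(0.0, gap)

def safety_mask(accels, gap, v_rear, v_front, b_min=4.0):
    """Return the candidate accelerations the agent may still choose.

    If the current gap is safe, every action is allowed; otherwise only
    actions braking at least as hard as b_min remain (or, if none exist,
    the hardest available braking action).
    """
    if gap >= rss_min_gap(v_rear, v_front, b_min=b_min):
        return list(accels)
    braking = [a for a in accels if a <= -b_min]
    return braking or [min(accels)]
```

During training and testing, the policy's action distribution would be restricted to the returned set before sampling, so that unsafe accelerations are never executed.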