With the increasing maturity of automated guided vehicle (AGV) technology and the widespread application of flexible manufacturing systems, enhancing the efficiency of AGVs in complex environments has become crucial. This paper analyzes the challenges of path planning and scheduling in multi-AGV systems, introduces a map-based path search algorithm, and proposes the breadth-first search (BFS) algorithm for shortest path planning. Through optimization using the BFS algorithm, efficient scheduling of multiple AGVs in complex environments is achieved. In addition, the effectiveness of the proposed method is validated in a production workshop experiment. The experimental results show that the BFS algorithm can quickly search for the shortest path, reduce the running time of AGVs, and significantly improve the performance of multi-AGV scheduling systems.
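The map-based shortest-path search mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the workshop map is a 4-connected grid with unit-cost moves, where BFS returns a path with the fewest moves. The grid encoding (`#` for a blocked cell), the function name `bfs_shortest_path`, and the sample layout are hypothetical and are not taken from the paper.

```python
from collections import deque

def bfs_shortest_path(grid, start, goal):
    """Return a fewest-moves 4-connected path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a blocked cell
    (a hypothetical encoding chosen for this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Follow parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != '#' and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal is unreachable

# Hypothetical 3x4 workshop map with a wall in the third column.
workshop = [
    "..#.",
    "..#.",
    "....",
]
path = bfs_shortest_path(workshop, (0, 0), (0, 3))
blocked = bfs_shortest_path(["#."], (0, 1), (0, 0))
```

Because every move costs the same, the first time BFS dequeues the goal it has found a shortest path. Coordinating several AGVs (conflict avoidance, scheduling) is a separate problem that this sketch does not address.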
Due to the fading characteristics of wireless channels and the burstiness of data traffic, how to deal with congestion in ad hoc networks with effective algorithms is still open and challenging. In this paper, we focus on enabling congestion control to minimize network transmission delays through flexible power control. To effectively solve the congestion problem, we propose a distributed cross-layer scheduling algorithm, which is empowered by graph-based multi-agent deep reinforcement learning. The transmit power is adaptively adjusted in real time by our algorithm based only on local information (i.e., channel state information and queue length) and local communication (i.e., information exchanged with neighbors). Moreover, the training complexity of the algorithm is low due to the regional cooperation based on the graph attention network. In the evaluation, we show that our algorithm can reduce the transmission delay of data flow under severe signal interference and drastically changing channel states, and demonstrate the adaptability and stability in different topologies. The method is general and can be extended to various types of topologies.
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To eliminate nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of the actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Aiming at the flexible manufacturing system with multi-machining and multi-assembly equipment, a new scheduling algorithm is proposed to decompose the assembly structure of the products, thus obtaining simple scheduling problems and forming the corresponding agents. Then, the importance and the restriction of each agent are considered to obtain an order of the simple scheduling problems based on cooperative game theory. With this order, the scheduling of the sub-problems is implemented in terms of rules, and near-optimal scheduling results that meet the restrictions can be obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.
Cloud computing provides a diverse and adaptable resource pool over the internet, allowing users to tap into various resources as needed. It has been seen as a robust solution to relevant challenges. A significant delay can hamper the performance of IoT-enabled cloud platforms. However, efficient task scheduling can lower the cloud infrastructure’s energy consumption, thus maximizing the service provider’s revenue by decreasing user job processing times. The proposed Modified Chimp-Whale Optimization Algorithm (MCWOA) combines elements of the Chimp Optimization Algorithm (COA) and the Whale Optimization Algorithm (WOA). To enhance MCWOA’s identification precision, the Sobol sequence is used in the population initialization phase, ensuring an even distribution of the population across the solution space. Moreover, the traditional MCWOA’s local search capabilities are augmented by incorporating the whale optimization algorithm’s bubble-net hunting and random search mechanisms into MCWOA’s position-updating process. This study demonstrates the effectiveness of the proposed approach using a two-story rigid frame and a simply supported beam model. Simulated outcomes reveal that the new method outperforms the original MCWOA, especially in multi-damage detection scenarios. MCWOA excels in avoiding false positives and enhancing computational speed, making it an optimal choice for structural damage detection. The efficiency of the proposed MCWOA is assessed against metrics such as energy usage, computational expense, task duration, and delay. The simulated data indicates that the new MCWOA outpaces other methods across all metrics. The study also references the Whale Optimization Algorithm (WOA), Chimp Algorithm (CA), Ant Lion Optimizer (ALO), Genetic Algorithm (GA), and Grey Wolf Optimizer (GWO).
Time-Sensitive Network (TSN) with deterministic transmission capability is increasingly used in many emerging fields. It mainly guarantees the Quality of Service (QoS) of applications with strict requirements on time and security. One of the core features of TSN is traffic scheduling with bounded low delay in the network. However, traffic scheduling schemes in TSN are usually synthesized offline and lack dynamism. To implement incremental scheduling of newly arrived traffic in TSN, we propose a Dynamic Response Incremental Scheduling (DR-IS) method for time-sensitive traffic and deploy it on a software-defined time-sensitive network architecture. Under the premise of meeting the traffic scheduling requirements, we adopt two modes, traffic shift and traffic exchange, to dynamically adjust the time slot injection position of the traffic in the original scheme, and determine the sending offset time of the new time-sensitive traffic to minimize the global traffic transmission jitter. The evaluation results show that the DR-IS method can effectively control the large increase of traffic transmission jitter in incremental scheduling without affecting the transmission delay, thus realizing the dynamic incremental scheduling of time-sensitive traffic in TSN.
Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning (MARL). It is significantly more difficult for tasks with latent variables that agents cannot directly observe. However, most existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable (MASAC-LV) algorithm, which uses variational inference theory to infer a compact latent variable representation space from a large amount of offline experience. In addition, we derive the counterfactual policy whose input has no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is treated as an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms.
Large-scale indoor 3D reconstruction with multiple robots faces challenges in core enabling technologies. This work contributes to a framework addressing localization, coordination, and vision processing for multi-agent reconstruction. A system architecture fusing visible light positioning, multi-agent path finding via reinforcement learning, and 360° camera techniques for 3D reconstruction is proposed. Our visible light positioning algorithm leverages existing lighting for centimeter-level localization without additional infrastructure. Meanwhile, a decentralized reinforcement learning approach is developed to solve the multi-agent path finding problem, with communications among agents optimized. Our 3D reconstruction pipeline utilizes equirectangular projection from 360° cameras to facilitate depth-independent reconstruction from posed monocular images using neural networks. Experimental validation demonstrates centimeter-level indoor navigation and 3D scene reconstruction capabilities of our framework. The challenges and limitations stemming from the above enabling technologies are discussed at the end of each corresponding section. In summary, this research advances fundamental techniques for multi-robot indoor 3D modeling, contributing to automated, data-driven applications through coordinated robot navigation, perception, and modeling.
Multi-agent reinforcement learning (MARL) has been a rapidly evolving field. This paper presents a comprehensive survey of MARL and its applications. We trace the historical evolution of MARL, highlight its progress, and discuss related survey works. Then, we review the existing works addressing inherent challenges and those focusing on diverse applications. Some representative stochastic games, MARL means, spatial forms of MARL, and task classification are revisited. We then conduct an in-depth exploration of a variety of challenges encountered in MARL applications. We also address critical operational aspects, such as hyperparameter tuning and computational complexity, which are pivotal in practical implementations of MARL. Afterward, we give a thorough overview of the applications of MARL to intelligent machines and devices, chemical engineering, biotechnology, healthcare, and societal issues, which highlights the extensive potential and relevance of MARL within both current and future technological contexts. Our survey also encompasses a detailed examination of benchmark environments used in MARL research, which are instrumental in evaluating MARL algorithms and demonstrate the adaptability of MARL to diverse application scenarios. Finally, we give our outlook for MARL and discuss related techniques and potential future applications.
The emergence of beyond-5G networks has the potential for seamless and intelligent connectivity on a global scale. Network slicing is crucial in delivering services for different, demanding vertical applications in this context. Next-generation applications have time-sensitive requirements and depend on the most efficient routing path to ensure packets reach their intended destinations. However, the existing IP (Internet Protocol) over a multi-domain network faces challenges in enforcing network slicing due to minimal collaboration and information sharing among network operators. Conventional inter-domain routing methods, like Border Gateway Protocol (BGP), cannot make routing decisions based on performance, which frequently results in traffic flowing across congested paths that are never optimal. To address these issues, we propose CoopAI-Route, a multi-agent cooperative deep reinforcement learning (DRL) system utilizing hierarchical software-defined networks (SDN). This framework enforces network slicing in multi-domain networks and cooperative communication with various administrators to find performance-based routes in intra- and inter-domain settings. CoopAI-Route employs the Distributed Global Topology (DGT) algorithm to define inter-domain Quality of Service (QoS) paths. CoopAI-Route uses a DRL agent with a message-passing multi-agent Twin-Delayed Deep Deterministic Policy Gradient method to ensure optimal end-to-end routes adapted to the specific requirements of network slicing applications. Our evaluation demonstrates CoopAI-Route’s commendable performance in scalability, link failure handling, and adaptability to evolving topologies compared to state-of-the-art methods.
The distributed flexible job shop scheduling problem (DFJSP) has attracted great attention with the growth of the global manufacturing industry. General DFJSP research only considers machine constraints and ignores worker constraints. As one critical factor of production, effective utilization of worker resources can increase productivity. Meanwhile, energy consumption is a growing concern due to the increasingly serious environmental issues. Therefore, the distributed flexible job shop scheduling problem with dual resource constraints (DFJSP-DRC) for minimizing makespan and total energy consumption is studied in this paper. To solve the problem, we present a multi-objective mathematical model for DFJSP-DRC and propose a Q-learning-based multi-objective grey wolf optimizer (Q-MOGWO). In Q-MOGWO, high-quality initial solutions are generated by a hybrid initialization strategy, and an improved active decoding strategy is designed to obtain the scheduling schemes. To further enhance the local search capability and expand the solution space, two wolf predation strategies and three critical factory neighborhood structures based on Q-learning are proposed. These strategies and structures enable Q-MOGWO to explore the solution space more efficiently and thus find better Pareto solutions. The effectiveness of Q-MOGWO in addressing DFJSP-DRC is verified through comparison with four algorithms using 45 instances. The results reveal that Q-MOGWO outperforms comparison algorithms in terms of solution quality.
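The abstract above evaluates schedules on two objectives to be minimized (makespan and total energy consumption) and speaks of Pareto solutions. As a minimal sketch of what that means, the generic dominance filter below keeps only the non-dominated objective pairs. It is not the Q-MOGWO algorithm itself, and the function names and sample values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (makespan, total energy) pairs for five candidate schedules.
schedules = [(100, 50), (90, 60), (110, 45), (95, 65), (100, 55)]
front = pareto_front(schedules)
```

Here `(95, 65)` is dominated by `(90, 60)` and `(100, 55)` by `(100, 50)`, so three trade-off schedules remain. This quadratic-time filter is fine for small sets; multi-objective optimizers typically use faster non-dominated sorting.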
In recent years, target tracking has been considered one of the most important applications of wireless sensor networks (WSNs). Optimizing target tracking performance and prolonging network lifetime are two equally critical objectives in this scenario. The existing mechanisms still have weaknesses in balancing the two demands. The proposed heuristic multi-node collaborative scheduling mechanism (HMNCS) comprises cluster head (CH) election, pre-selection, and task set selection mechanisms, where the latter two kinds of selections form a two-layer selection mechanism. The CH election innovatively introduces the movement trend of the target and establishes a scoring mechanism to determine the optimal CH, which can delay the CH rotation and thus reduce energy consumption. The pre-selection mechanism adaptively filters out suitable nodes as the candidate task set to apply for tracking tasks, which can reduce the application consumption and the overhead of the following task set selection. Finally, the task node selection is mathematically transformed into an optimization problem and the genetic algorithm is adopted to form a final task set in the task set selection mechanism. Simulation results show that HMNCS outperforms other compared mechanisms in tracking accuracy and network lifetime.
As cloud quantum computing gains broader acceptance, a growing number of researchers are directing their focus towards this domain. Nevertheless, the rapid surge in demand for cloud-based quantum computing resources has led to a scarcity, which in turn hampers users from achieving optimal satisfaction. Therefore, cloud quantum computing service providers require a unified analysis and scheduling framework for their quantum resources and user jobs to meet the ever-growing usage demands. This paper introduces a new multi-programming scheduling framework for quantum computing in a cloud environment. The framework addresses the issue of limited quantum computing resources in cloud environments and ensures a satisfactory user experience. It introduces three innovative designs: 1) The framework automatically allocates tasks to different quantum backends while ensuring fairness among users by considering both the cloud-based quantum resources and the user-submitted tasks. 2) A multi-programming mechanism is employed across different quantum backends to enhance the overall throughput of the quantum cloud. In comparison to conventional task schedulers, the proposed framework achieves a throughput improvement of more than two-fold in the quantum cloud. 3) The framework can balance fidelity and user waiting time by adaptively adjusting scheduling parameters.
This paper examines the bipartite consensus problems for nonlinear multi-agent systems in Lurie dynamics form with cooperative and competitive communication between different agents. Based on contraction theory, some new conditions for nonlinear Lurie multi-agent systems reaching bipartite leaderless consensus and bipartite tracking consensus are presented. Compared with the traditional methods, this approach reduces the dimensions of the conditions, eliminates some restrictions on the system matrix, and extends the range of the nonlinear function. Finally, two numerical examples are provided to illustrate the efficiency of our results.
This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent systems. New hyperbolic tangent function-based protocols are proposed to achieve global and semi-global finite-time consensus for both single-integrator and double-integrator multi-agent systems with leaderless undirected and leader-following directed communication topologies. These new protocols not only provide an explicit upper-bound estimate for the settling time, but also have a user-prescribed bounded control level. In addition, compared to some existing results based on the saturation function, the proposed approach considerably simplifies the protocol design and the stability analysis. Illustrative examples and an application demonstrate the effectiveness of the proposed protocols.
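To illustrate why a hyperbolic tangent keeps the control level bounded, the sketch below simulates single-integrator agents on an undirected graph with the update u_i = Σ_j tanh(x_j − x_i), integrated by forward Euler. This is a simplified stand-in chosen for illustration, not the protocol proposed in the paper: it converges only asymptotically and makes no finite-time claim. The graph, step size, and initial states are hypothetical.

```python
import math

def simulate_tanh_consensus(x0, edges, dt=0.01, steps=2000):
    """Forward-Euler simulation of x_i' = sum_j tanh(x_j - x_i).

    edges lists each undirected link (i, j) once. Returns the final
    states and the largest control magnitude seen during the run.
    """
    x = list(x0)
    peak_u = 0.0
    for _ in range(steps):
        u = [0.0] * len(x)
        for i, j in edges:
            f = math.tanh(x[j] - x[i])  # each edge term lies in (-1, 1)
            u[i] += f
            u[j] -= f                   # tanh is odd, so the sum of states is preserved
        peak_u = max(peak_u, max(abs(v) for v in u))
        x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x, peak_u

# Hypothetical 4-agent ring; every agent has two neighbours.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
final_states, peak_u = simulate_tanh_consensus([4.0, -2.0, 1.0, 0.0], ring)
```

Since each edge contributes at most 1 in magnitude, the control of an agent never exceeds its number of neighbours (2 on this ring), however far apart the initial states are. The states settle at the initial average of 0.75.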
This paper is concerned with consensus of a second-order linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network. A proportional-integral consensus protocol is designed by using delayed and memorized state information. Under the proportional-integral consensus protocol, the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system. Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system, but also plays a critical role in the dynamic performance of the system. In this paper, based on recent results on the distribution of roots of quasi-polynomials, several necessary conditions for Hurwitz stability for a class of quasi-polynomials are first derived. Then allowable regions of consensus protocol parameters are estimated. Some necessary and sufficient conditions for determining effective protocol parameters are provided. The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system. Moreover, the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated. Furthermore, some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.
In current research on task offloading and resource scheduling in vehicular networks, vehicles are commonly assumed to maintain constant speed or relatively stationary states, and the impact of speed variations on task offloading is often overlooked. It is frequently assumed that vehicles can be accurately modeled during actual motion processes. However, in dynamic vehicular environments, both the tasks generated by the vehicles and the vehicles’ surroundings are constantly changing, making it difficult to achieve real-time modeling for actual dynamic vehicular network scenarios. Taking into account the actual dynamic vehicular scenarios, this paper considers the real-time non-uniform movement of vehicles and proposes a vehicular task dynamic offloading and scheduling algorithm for single-task multi-vehicle vehicular network scenarios, attempting to solve the dynamic decision-making problem in the task offloading process. The optimization objective is to minimize the average task completion time, which is formulated as a multi-constrained non-linear programming problem. Due to the mobility of vehicles, a constraint model is applied in the decision-making process to dynamically determine whether the communication range is sufficient for task offloading and transmission. Finally, the proposed vehicular task dynamic offloading and scheduling algorithm based on multi-agent deep deterministic policy gradient (MADDPG) is applied to solve the optimization problem. Simulation results show that the proposed algorithm achieves lower-latency task computation offloading. Meanwhile, the average task completion time of the proposed algorithm is improved by 7.6% compared to the MADDPG scheme and by 51.1% compared to deep deterministic policy gradient (DDPG).
Funding: supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)); supported by the National Natural Science Foundation of China under Grant No. 61971264; the National Natural Science Foundation of China/Research Grants Council Collaborative Research Scheme under Grant No. 62261160390.
Funding: supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643); Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore; Infocomm Media Development Authority under its Future Communications Research Development Programme; DSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019; Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme; DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme; MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
Abstract: Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
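MAPPO builds on the clipped surrogate objective of proximal policy optimization (PPO). The following minimal sketch shows that per-sample clipped term only; the function name and the default clipping range of 0.2 are illustrative assumptions, and nothing here reproduces the transformer-based variant described in the abstract.

```python
def ppo_clip_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio     -- new-policy probability divided by old-policy probability
    advantage -- advantage estimate for the sampled action
    eps       -- clipping range (0.2 is an assumed, commonly used default)
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A large ratio with a positive advantage is clipped at 1 + eps.
    print(ppo_clip_term(1.5, 1.0))
    # A small ratio with a negative advantage is clipped at 1 - eps.
    print(ppo_clip_term(0.5, -1.0))
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is one reason PPO-style updates are often preferred in multi-agent settings with shared parameters.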
Funding: The National Natural Science Foundation of China (62203356); Fundamental Research Funds for the Central Universities of China (31020210502002).
Abstract: This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To eliminate nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of the actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Abstract: For flexible manufacturing systems with multiple machining and assembly equipment, a new scheduling algorithm is proposed that decomposes the assembly structure of the products, thus obtaining simple scheduling problems and forming the corresponding agents. Then, the importance and the restrictions of each agent are considered to obtain an order of the simple scheduling problems based on cooperative game theory. With this order, the scheduling of the sub-problems is implemented in terms of rules, and near-optimal scheduling results that meet the restrictions can be obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.
Abstract: Cloud computing provides a diverse and adaptable resource pool over the internet, allowing users to tap into various resources as needed. It has been seen as a robust solution to relevant challenges. A significant delay can hamper the performance of IoT-enabled cloud platforms. However, efficient task scheduling can lower the cloud infrastructure’s energy consumption, thus maximizing the service provider’s revenue by decreasing user job processing times. The proposed Modified Chimp-Whale Optimization Algorithm (MCWOA) combines elements of the Chimp Optimization Algorithm (COA) and the Whale Optimization Algorithm (WOA). To enhance MCWOA’s identification precision, the Sobol sequence is used in the population initialization phase, ensuring an even distribution of the population across the solution space. Moreover, the traditional MCWOA’s local search capabilities are augmented by incorporating the whale optimization algorithm’s bubble-net hunting and random search mechanisms into MCWOA’s position-updating process. This study demonstrates the effectiveness of the proposed approach using a two-story rigid frame and a simply supported beam model. Simulated outcomes reveal that the new method outperforms the original MCWOA, especially in multi-damage detection scenarios. MCWOA excels in avoiding false positives and enhancing computational speed, making it an optimal choice for structural damage detection. The efficiency of the proposed MCWOA is assessed against metrics such as energy usage, computational expense, task duration, and delay. The simulated data indicates that the new MCWOA outpaces other methods across all metrics. The study also references the Whale Optimization Algorithm (WOA), Chimp Algorithm (CA), Ant Lion Optimizer (ALO), Genetic Algorithm (GA), and Grey Wolf Optimizer (GWO).
Funding: Supported by the Innovation Scientists and Technicians Troop Construction Projects of Henan Province (224000510002).
Abstract: Time-Sensitive Network (TSN) with deterministic transmission capability is increasingly used in many emerging fields. It mainly guarantees the Quality of Service (QoS) of applications with strict requirements on time and security. One of the core features of TSN is traffic scheduling with bounded low delay in the network. However, traffic scheduling schemes in TSN are usually synthesized offline and lack dynamism. To implement incremental scheduling of newly arrived traffic in TSN, we propose a Dynamic Response Incremental Scheduling (DR-IS) method for time-sensitive traffic and deploy it on a software-defined time-sensitive network architecture. Under the premise of meeting the traffic scheduling requirements, we adopt two modes, traffic shift and traffic exchange, to dynamically adjust the time slot injection position of the traffic in the original scheme, and determine the sending offset time of the new time-sensitive traffic to minimize the global traffic transmission jitter. The evaluation results show that the DR-IS method can effectively control the large increase of traffic transmission jitter in incremental scheduling without affecting the transmission delay, thus realizing the dynamic incremental scheduling of time-sensitive traffic in TSN.
Funding: Supported in part by the National Natural Science Foundation of China (62136008, 62236002, 61921004, 62173251, 62103104); the “Zhishan” Scholars Programs of Southeast University; the Fundamental Research Funds for the Central Universities (2242023K30034).
Abstract: Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning (MARL). It is significantly more difficult for those tasks with latent variables that agents cannot directly observe. However, most of the existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable (MASAC-LV) algorithm, which uses variational inference theory to infer the compact latent variable representation space from a large amount of offline experience. Besides, we derive the counterfactual policy whose input has no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is considered an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms.
Funding: Supported by Bright Dream Robotics and the HKUST-BDR Joint Research Institute Funding Scheme under Project HBJRI-FTP-005 (Automated 3D Reconstruction using Robot-mounted 360-Degree Camera with Visible Light Positioning Technology for Building Information Modelling Applications, OKT22EG06).
Abstract: Large-scale indoor 3D reconstruction with multiple robots faces challenges in core enabling technologies. This work contributes to a framework addressing localization, coordination, and vision processing for multi-agent reconstruction. A system architecture fusing visible light positioning, multi-agent path finding via reinforcement learning, and 360° camera techniques for 3D reconstruction is proposed. Our visible light positioning algorithm leverages existing lighting for centimeter-level localization without additional infrastructure. Meanwhile, a decentralized reinforcement learning approach is developed to solve the multi-agent path finding problem, with communications among agents optimized. Our 3D reconstruction pipeline utilizes equirectangular projection from 360° cameras to facilitate depth-independent reconstruction from posed monocular images using neural networks. Experimental validation demonstrates centimeter-level indoor navigation and 3D scene reconstruction capabilities of our framework. The challenges and limitations stemming from the above enabling technologies are discussed at the end of each corresponding section. In summary, this research advances fundamental techniques for multi-robot indoor 3D modeling, contributing to automated, data-driven applications through coordinated robot navigation, perception, and modeling.
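The equirectangular projection mentioned in the abstract maps each pixel of a 360° image to a viewing direction on the unit sphere. The sketch below illustrates that mapping under an assumed axis convention (x right, y up, z forward, longitude increasing left to right); the function name and convention are illustrative and are not taken from the paper's pipeline.

```python
import math


def equirect_pixel_to_ray(u, v, width, height):
    """Map pixel coordinates (u, v) of an equirectangular image to a unit ray.

    Assumed convention (hypothetical, for illustration only):
      - longitude spans [-pi, pi) from the left edge to the right edge,
      - latitude spans [pi/2, -pi/2] from the top edge to the bottom edge,
      - x points right, y points up, z points forward (image centre).
    Pixel centres sit at integer coordinates plus 0.5.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


if __name__ == "__main__":
    # The image centre looks straight ahead along +z under this convention.
    print(equirect_pixel_to_ray(1.5, 0.5, 4, 2))
```

Because every pixel corresponds to a known direction regardless of scene depth, rays like these can be combined with camera poses when reconstructing geometry from posed images.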
Funding: Ministry of Education, Singapore, under AcRF TIER 1 Grant RG64/23; the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a Schmidt Futures program, USA.
Abstract: Multi-agent reinforcement learning (MARL) has been a rapidly evolving field. This paper presents a comprehensive survey of MARL and its applications. We trace the historical evolution of MARL, highlight its progress, and discuss related survey works. Then, we review the existing works addressing inherent challenges and those focusing on diverse applications. Some representative stochastic games, MARL means, spatial forms of MARL, and task classification are revisited. We then conduct an in-depth exploration of a variety of challenges encountered in MARL applications. We also address critical operational aspects, such as hyperparameter tuning and computational complexity, which are pivotal in practical implementations of MARL. Afterward, we give a thorough overview of the applications of MARL to intelligent machines and devices, chemical engineering, biotechnology, healthcare, and societal issues, which highlights the extensive potential and relevance of MARL within both current and future technological contexts. Our survey also encompasses a detailed examination of benchmark environments used in MARL research, which are instrumental in evaluating MARL algorithms and demonstrate the adaptability of MARL to diverse application scenarios. In the end, we give our outlook for MARL and discuss its related techniques and potential future applications.
Abstract: The emergence of beyond 5G networks has the potential for seamless and intelligent connectivity on a global scale. Network slicing is crucial in delivering services for different, demanding vertical applications in this context. Next-generation applications have time-sensitive requirements and depend on the most efficient routing path to ensure packets reach their intended destinations. However, the existing IP (Internet Protocol) over a multi-domain network faces challenges in enforcing network slicing due to minimal collaboration and information sharing among network operators. Conventional inter-domain routing methods, like Border Gateway Protocol (BGP), cannot make routing decisions based on performance, which frequently results in traffic flowing across congested paths that are never optimal. To address these issues, we propose CoopAI-Route, a multi-agent cooperative deep reinforcement learning (DRL) system utilizing hierarchical software-defined networks (SDN). This framework enforces network slicing in multi-domain networks and cooperative communication with various administrators to find performance-based intra- and inter-domain routes. CoopAI-Route employs the Distributed Global Topology (DGT) algorithm to define inter-domain Quality of Service (QoS) paths. CoopAI-Route uses a DRL agent with a message-passing multi-agent Twin-Delayed Deep Deterministic Policy Gradient method to ensure optimal end-to-end routes adapted to the specific requirements of network slicing applications. Our evaluation demonstrates CoopAI-Route’s commendable performance in scalability, link failure handling, and adaptability to evolving topologies compared to state-of-the-art methods.
Funding: Supported by the Natural Science Foundation of Anhui Province (Grant Number 2208085MG181); the Science Research Project of Higher Education Institutions in Anhui Province, Philosophy and Social Sciences (Grant Number 2023AH051063); the Open Fund of Key Laboratory of Anhui Higher Education Institutes (Grant Number CS2021-ZD01).
Abstract: The distributed flexible job shop scheduling problem (DFJSP) has attracted great attention with the growth of the global manufacturing industry. General DFJSP research only considers machine constraints and ignores worker constraints. As one critical factor of production, effective utilization of worker resources can increase productivity. Meanwhile, energy consumption is a growing concern due to the increasingly serious environmental issues. Therefore, the distributed flexible job shop scheduling problem with dual resource constraints (DFJSP-DRC) for minimizing makespan and total energy consumption is studied in this paper. To solve the problem, we present a multi-objective mathematical model for DFJSP-DRC and propose a Q-learning-based multi-objective grey wolf optimizer (Q-MOGWO). In Q-MOGWO, high-quality initial solutions are generated by a hybrid initialization strategy, and an improved active decoding strategy is designed to obtain the scheduling schemes. To further enhance the local search capability and expand the solution space, two wolf predation strategies and three critical factory neighborhood structures based on Q-learning are proposed. These strategies and structures enable Q-MOGWO to explore the solution space more efficiently and thus find better Pareto solutions. The effectiveness of Q-MOGWO in addressing DFJSP-DRC is verified through comparison with four algorithms using 45 instances. The results reveal that Q-MOGWO outperforms comparison algorithms in terms of solution quality.
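The abstract states that Q-learning guides the choice among three neighborhood structures but does not give the state, action, or reward definitions. The sketch below shows only the standard tabular Q-learning update; treating each of the three neighborhood structures as an action index, and the particular states, rewards, learning rate, and discount factor, are hypothetical choices made for illustration.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))

    q_table -- list of rows, one row of action values per state
    action  -- here, a hypothetical index of the chosen neighborhood structure
    alpha, gamma -- assumed learning rate and discount factor
    """
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


if __name__ == "__main__":
    # Two hypothetical search states, three neighborhood structures (actions).
    q = [[0.0, 0.0, 0.0], [1.0, 2.0, 0.0]]
    # Reward 1.0 might signal that the chosen structure improved the solution.
    print(q_update(q, state=0, action=1, reward=1.0, next_state=1))
```

In a metaheuristic such as the one described, the reward would typically reflect whether applying the selected neighborhood structure produced a better solution, so that structures which help are chosen more often.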
Funding: The Project Program of Science and Technology on Micro-System Laboratory, No. 6142804220101.
Abstract: In recent years, target tracking has been considered one of the most important applications of wireless sensor networks (WSNs). Optimizing target tracking performance and prolonging network lifetime are two equally critical objectives in this scenario. The existing mechanisms still have weaknesses in balancing the two demands. The proposed heuristic multi-node collaborative scheduling mechanism (HMNCS) comprises cluster head (CH) election, pre-selection, and task set selection mechanisms, where the latter two form a two-layer selection mechanism. The CH election innovatively introduces the movement trend of the target and establishes a scoring mechanism to determine the optimal CH, which can delay the CH rotation and thus reduce energy consumption. The pre-selection mechanism adaptively filters out suitable nodes as the candidate task set to apply for tracking tasks, which can reduce the application consumption and the overhead of the following task set selection. Finally, the task node selection is mathematically transformed into an optimization problem, and the genetic algorithm is adopted to form a final task set in the task set selection mechanism. Simulation results show that HMNCS outperforms other compared mechanisms in tracking accuracy and network lifetime.
Abstract: As cloud quantum computing gains broader acceptance, a growing number of researchers are directing their focus towards this domain. Nevertheless, the rapid surge in demand for cloud-based quantum computing resources has led to a scarcity, which in turn hampers users from achieving optimal satisfaction. Therefore, cloud quantum computing service providers require a unified analysis and scheduling framework for their quantum resources and user jobs to meet the ever-growing usage demands. This paper introduces a new multi-programming scheduling framework for quantum computing in a cloud environment. The framework addresses the issue of limited quantum computing resources in cloud environments and ensures a satisfactory user experience. It introduces three innovative designs: 1) Our framework automatically allocates tasks to different quantum backends while ensuring fairness among users by considering both the cloud-based quantum resources and the user-submitted tasks. 2) A multi-programming mechanism is employed across different quantum backends to enhance the overall throughput of the quantum cloud. In comparison to conventional task schedulers, our proposed framework achieves a throughput improvement of more than two-fold in the quantum cloud. 3) The framework can balance fidelity and user waiting time by adaptively adjusting scheduling parameters.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 62363005); the Jiangxi Provincial Natural Science Foundation (Grant Nos. 20161BAB212032 and 20232BAB202034); the Science and Technology Research Project of Jiangxi Provincial Department of Education (Grant Nos. GJJ202602 and GJJ202601).
Abstract: This paper examines the bipartite consensus problems for nonlinear multi-agent systems in Lurie dynamics form with cooperative and competitive communication between different agents. Based on contraction theory, some new conditions for the nonlinear Lurie multi-agent systems to reach bipartite leaderless consensus and bipartite tracking consensus are presented. Compared with the traditional methods, this approach reduces the dimensions of the conditions, eliminates some restrictions on the system matrix, and extends the range of the nonlinear function. Finally, two numerical examples are provided to illustrate the efficiency of our results.
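Bipartite consensus means that agents split into two groups whose states agree in magnitude but have opposite signs. The sketch below simulates the simplest well-known case, linear single-integrator agents on a structurally balanced signed graph, with forward-Euler integration. It is not the Lurie-type nonlinear dynamics or the contraction-based conditions of the paper; the graph, step size, and function name are assumptions made for illustration.

```python
def simulate_bipartite_consensus(x0, signed_edges, dt=0.01, steps=2000):
    """Euler simulation of x_i' = -sum_j |a_ij| * (x_i - sign(a_ij) * x_j).

    x0           -- initial scalar state of each agent
    signed_edges -- undirected edges (i, j, weight); a positive weight models
                    cooperation, a negative weight models competition
    """
    x = list(x0)
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for i, j, w in signed_edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    for _ in range(steps):
        dx = []
        for i in range(n):
            total = 0.0
            for j, w in neighbors[i]:
                sign = 1.0 if w > 0 else -1.0
                total += abs(w) * (x[i] - sign * x[j])
            dx.append(-total)
        # Update all agents simultaneously from the previous state.
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


if __name__ == "__main__":
    # Agents {0, 1} cooperate, agents {2, 3} cooperate, the groups compete.
    edges = [(0, 1, 1.0), (2, 3, 1.0), (1, 2, -1.0), (0, 3, -1.0)]
    print(simulate_bipartite_consensus([1.0, 2.0, 3.0, -4.0], edges))
```

For this balanced example the two groups settle at values of equal magnitude and opposite sign, which is the behaviour the term bipartite consensus refers to.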
Funding: Supported by the National Natural Science Foundation of China (62073019).
Abstract: This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent systems. New hyperbolic tangent function-based protocols are proposed to achieve global and semi-global finite-time consensus for both single-integrator and double-integrator multi-agent systems with leaderless undirected and leader-following directed communication topologies. These new protocols not only provide an explicit upper-bound estimate for the settling time, but also have a user-prescribed bounded control level. In addition, compared to some existing results based on the saturation function, the proposed approach considerably simplifies the protocol design and the stability analysis. Illustrative examples and an application demonstrate the effectiveness of the proposed protocols.
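The abstract does not state the protocols themselves, so the following sketch uses a generic tanh-saturated law, u_i = -k tanh(sum over neighbors of (x_i - x_j)), for single-integrator agents on an undirected graph. It illustrates only the bounded control level (|u_i| never exceeds the gain k); this simple law converges asymptotically and does not reproduce the finite-time property or the settling-time estimate claimed in the paper. All names and parameter values are assumptions.

```python
import math


def simulate_tanh_consensus(x0, edges, gain=1.0, dt=0.01, steps=5000):
    """Euler simulation of x_i' = u_i with u_i = -gain * tanh(sum_j (x_i - x_j)).

    x0    -- initial scalar state of each agent
    edges -- undirected edges (i, j) of the communication graph
    Returns the final states and the largest control magnitude observed.
    """
    x = list(x0)
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    max_u = 0.0
    for _ in range(steps):
        # The control uses only relative states of neighbors and is bounded by `gain`.
        u = [-gain * math.tanh(sum(x[i] - x[j] for j in neighbors[i]))
             for i in range(n)]
        max_u = max(max_u, max(abs(ui) for ui in u))
        x = [xi + dt * ui for xi, ui in zip(x, u)]
    return x, max_u


if __name__ == "__main__":
    # Four agents on a path graph 0 - 1 - 2 - 3.
    final_states, peak_control = simulate_tanh_consensus(
        [5.0, -3.0, 1.0, 0.0], [(0, 1), (1, 2), (2, 3)])
    print(final_states, peak_control)
```

The hyperbolic tangent caps the control effort however far apart the agents start, which is the practical appeal of a user-prescribed control bound.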
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (61703086, 61773106); the IAPI Fundamental Research Funds (2018ZCX27).
Abstract: This paper is concerned with consensus of a second-order linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network. A proportional-integral consensus protocol is designed by using delayed and memorized state information. Under the proportional-integral consensus protocol, the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system. Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system, but also plays a critical role in the dynamic performance of the system. In this paper, based on recent results on the distribution of roots of quasi-polynomials, several necessary conditions for Hurwitz stability for a class of quasi-polynomials are first derived. Then allowable regions of consensus protocol parameters are estimated. Some necessary and sufficient conditions for determining effective protocol parameters are provided. The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system. Moreover, the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated. Furthermore, some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.
Abstract: In current research on task offloading and resource scheduling in vehicular networks, vehicles are commonly assumed to maintain constant speed or relatively stationary states, and the impact of speed variations on task offloading is often overlooked. It is frequently assumed that vehicles can be accurately modeled during actual motion processes. However, in vehicular dynamic environments, both the tasks generated by the vehicles and the vehicles’ surroundings are constantly changing, making it difficult to achieve real-time modeling for actual dynamic vehicular network scenarios. Taking into account the actual dynamic vehicular scenarios, this paper considers the real-time non-uniform movement of vehicles and proposes a vehicular task dynamic offloading and scheduling algorithm for single-task multi-vehicle vehicular network scenarios, attempting to solve the dynamic decision-making problem in the task offloading process. The optimization objective is to minimize the average task completion time, which is formulated as a multi-constrained non-linear programming problem. Due to the mobility of vehicles, a constraint model is applied in the decision-making process to dynamically determine whether the communication range is sufficient for task offloading and transmission. Finally, the proposed vehicular task dynamic offloading and scheduling algorithm based on multi-agent deep deterministic policy gradient (MADDPG) is applied to solve the optimization problem. Simulation results show that the proposed algorithm achieves lower-latency task computation offloading. Meanwhile, the average task completion time of the proposed algorithm improves by 7.6% compared to the MADDPG scheme and by 51.1% compared to deep deterministic policy gradient (DDPG).