UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach (cited by: 1)
1
Authors: Jiawen Kang, Junlong Chen, Minrui Xu, Zehui Xiong, Yutao Jiao, Luchao Han, Dusit Niyato, Yongju Tong, Shengli Xie. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 2, pp. 430-445 (16 pages)
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
Keywords: avatar; blockchain; metaverses; multi-agent deep reinforcement learning; transformer; UAVs
Diverse Deep Matrix Factorization With Hypergraph Regularization for Multi-View Data Representation
2
Authors: Haonan Huang, Guoxu Zhou, Naiyao Liang, Qibin Zhao, Shengli Xie. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, Issue 11, pp. 2154-2167 (14 pages)
Deep matrix factorization (DMF) has been demonstrated to be a powerful tool to take in the complex hierarchical information of multi-view data (MDR). However, existing multi-view DMF methods mainly explore the consistency of multi-view data, while neglecting the diversity among different views as well as the high-order relationships of data, resulting in the loss of valuable complementary information. In this paper, we design a hypergraph regularized diverse deep matrix factorization (HDDMF) model for multi-view data representation, to jointly utilize multi-view diversity and a high-order manifold in a multilayer factorization framework. A novel diversity enhancement term is designed to exploit the structural complementarity between different views of data. Hypergraph regularization is utilized to preserve the high-order geometry structure of data in each view. An efficient iterative optimization algorithm is developed to solve the proposed model with theoretical convergence analysis. Experimental results on five real-world data sets demonstrate that the proposed method significantly outperforms state-of-the-art multi-view learning approaches.
Keywords: deep matrix factorization (DMF); diversity; hypergraph regularization; multi-view data representation (MDR)
Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation
3
Authors: QIN Yahang, ZHANG Chengye, CHEN Ci, XIE Shengli, LEWIS Frank L. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2024, Issue 1, pp. 114-135 (22 pages)
This paper presents a learning-based control policy design for point-to-point vehicle positioning in the urban environment via BeiDou navigation. While navigating in urban canyons, the multipath effect is a kind of interference that causes the navigation signal to drift and thus imposes severe impacts on vehicle localization due to the reflection and diffraction of the BeiDou signal. Here, the authors formulate the navigation control system with unknown vehicle dynamics as an optimal control-seeking problem through a linear discrete-time system, and the point-to-point localization control is modeled and handled by leveraging off-policy reinforcement learning for feedback control. The proposed learning-based design guarantees optimality with prescribed performance and also stabilizes the closed-loop navigation system, without full knowledge of the vehicle dynamics. It is seen that the proposed method can withstand the impact of the multipath effect while satisfying the prescribed convergence rate. A case study demonstrates that the proposed algorithms effectively drive the vehicle to a desired setpoint under the multipath effect introduced by actual experiments of BeiDou navigation in the urban environment.
Keywords: BeiDou navigation; multipath effect; prescribed convergence rate; reinforcement learning; urban localization
Learning the continuous-time optimal decision law from discrete-time rewards
4
Authors: Ci Chen, Lihua Xie, Kan Xie, Frank Leroy Lewis, Yilu Liu, Shengli Xie. National Science Open, 2024, Issue 5, pp. 130-147 (18 pages)
The concept of reward is fundamental in reinforcement learning with a wide range of applications in natural and social sciences. Seeking an interpretable reward for decision-making that largely shapes the system's behavior has always been a challenge in reinforcement learning. In this work, we explore a discrete-time reward for reinforcement learning in continuous time and action spaces that represent many phenomena captured by applying physical laws. We find that the discrete-time reward leads to the extraction of the unique continuous-time decision law and improved computational efficiency by dropping the integrator operator that appears in classical results with integral rewards. We apply this finding to solve output-feedback design problems in power systems. The results reveal that our approach removes an intermediate stage of identifying dynamical models. Our work suggests that the discrete-time reward is efficient in search of the desired decision law, which provides a computational tool to understand and modify the behavior of large-scale engineering systems using the optimal learned decision.
Keywords: continuous-time state and action; decision law learning; discrete-time reward; dynamical systems; reinforcement learning