Abstract
In the near future, integrated sensing and communication (ISAC) systems are expected to provide communication and sensing services simultaneously. Such systems need advanced beamforming optimization algorithms to guarantee quality of service while meeting diverse service targets and resource constraints. Beamforming design is usually formulated as an optimization problem. However, algorithms built on traditional optimization theory can only handle resource allocation problems with instantaneous constraints; they cannot handle problems with long-term constraints, which degrades system performance. One feasible remedy is to design the algorithms based on reinforcement learning (RL). Existing work, however, focuses mainly on unconstrained RL problems and pays little attention to constrained RL problems, which limits the use of RL in beamforming optimization. To overcome this challenge, we propose a novel RL method based on constrained stochastic successive convex approximation (CSSCA). The method replaces the original objective function and constraint functions with corresponding convex approximation functions and solves a sequence of convex approximation problems, which guarantees convergence to a Karush-Kuhn-Tucker (KKT) point of the original problem. Finally, simulation results demonstrate the superiority of the proposed method.
Authors
黄哲
刘安
HUANG Zhe; LIU An (Zhejiang University, Hangzhou 310007, China)
Source
《移动通信》
2024, No. 10, pp. 41-48 (8 pages)
Mobile Communications
Funding
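National Natural Science Foundation of China, "Joint Compressed Channel Estimation and Localization and Tracking Methods Based on Deep Stochastic Optimization" (62071416).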
Keywords
integrated sensing and communication
beamforming optimization
deep reinforcement learning
constrained stochastic successive convex approximation