Abstract
This paper proposes a joint user scheduling and power control scheme with quality of service (QoS) guarantees that aims to minimize the time-averaged power consumption. Each user maintains a buffer to store randomly arriving data, and each user's QoS requirement is characterized by its time-averaged queuing delay. Based on the dynamics of the wireless channels and the data queue lengths, joint user scheduling and power control is formulated as a constrained Markov decision problem (CMDP). To handle the case in which the system cannot accurately obtain the distribution parameters of the channel and data arrival processes, the CMDP is solved with the Q-learning algorithm, which yields an online learning algorithm for user scheduling and power control. Without prior knowledge of the distributions of the wireless channels and data arrivals, the system makes user scheduling and power control decisions through online learning to minimize the time-averaged power consumption.
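The abstract does not give the system model, so the following is only a minimal sketch of how tabular Q-learning could drive joint scheduling and power decisions without knowing the arrival or channel statistics. Everything concrete in it is an assumption made for illustration: two users, a buffer of four packets, three power levels, a two-state channel, and the toy service model in `served_packets`. The delay constraint is handled here with a fixed Lagrange multiplier `LAMBDA` on the total queue length (by Little's law, average delay is proportional to average queue length for a fixed arrival rate), and a discounted cost is used for simplicity; the paper's actual formulation and multiplier update may differ.

```python
import random
from collections import defaultdict

# All constants below are hypothetical values chosen for illustration.
N_USERS = 2                 # number of users
MAX_QUEUE = 4               # buffer size in packets
POWER_LEVELS = (0, 1, 2)    # discrete transmit power levels
ARRIVAL_PROB = 0.3          # used only by the simulator, unknown to the learner
GOOD_PROB = 0.5             # used only by the simulator, unknown to the learner
LAMBDA = 1.0                # fixed Lagrange multiplier on queue backlog

# An action is (scheduled user, transmit power).
ACTIONS = [(u, p) for u in range(N_USERS) for p in POWER_LEVELS]


def served_packets(power, channel):
    """Toy service model: a good channel (1) doubles the packets sent."""
    return power * (2 if channel else 1)


def step(state, action, rng):
    """Simulate one time slot and return (next_state, cost)."""
    queues, channels = state
    user, power = action
    queues = list(queues)
    # Serve the scheduled user first, then admit new arrivals.
    queues[user] = max(0, queues[user] - served_packets(power, channels[user]))
    for i in range(N_USERS):
        if rng.random() < ARRIVAL_PROB:
            queues[i] = min(MAX_QUEUE, queues[i] + 1)
    next_channels = tuple(int(rng.random() < GOOD_PROB) for _ in range(N_USERS))
    # Lagrangian cost: power plus weighted backlog (a proxy for delay).
    cost = power + LAMBDA * sum(queues)
    return (tuple(queues), next_channels), cost


def q_learning(n_slots=20000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Learn Q-values online from observed transitions only."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = ((0,) * N_USERS, (1,) * N_USERS)
    for _ in range(n_slots):
        # Epsilon-greedy exploration; costs are minimized, so take the min.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = min(ACTIONS, key=lambda a: q[(state, a)])
        next_state, cost = step(state, action, rng)
        best_next = min(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (cost + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_action(q, state):
    """Return the (user, power) pair with the lowest learned cost."""
    return min(ACTIONS, key=lambda a: q[(state, a)])
```

After training with `q = q_learning()`, calling `greedy_action(q, ((3, 1), (1, 0)))` returns the learned scheduling and power decision for a slot in which user 0 holds three packets on a good channel and user 1 holds one packet on a bad channel. In a constrained setting the multiplier would normally be tuned, or updated online, until the measured average delay meets its target; that step is omitted here.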
Source
Radio Communications Technology (《无线电通信技术》)
2018, No. 1, pp. 78-81 (4 pages)
Funding
Natural Science Foundation of Hebei Province (F2014210123)