Abstract
This paper studies optimal call admission control (CAC) under a time-based reward scheme. The system is modeled as a continuous-time Markov decision process (CTMDP), and an afterstate Q-value update method is introduced based on the characteristics of the system. An event-driven Q-learning optimization algorithm is then proposed for the call admission control problem, and a numerical simulation example is given. The simulation results show that the proposed algorithm converges faster and requires less memory than Q-learning. Based on the experimental results, the relationship between the service rejection rate and the service characteristics under the optimal admission policy is analyzed.
Source
《合肥工业大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2011, No. 1, pp. 76-79 (4 pages)
Journal of Hefei University of Technology: Natural Science
Funding
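National Natural Science Foundation of China (60873003)
Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education (2009AKZR0279)
Natural Science Foundation of Anhui Province (090412046)
Key Project of Natural Science Research of Anhui Provincial Universities (KJ2008A058)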