Funding: National Natural Science Foundation of China (No. 60632030); National High Technology Research and Development Program of China (No. 2006AA01Z276).
Abstract: This paper presents the multi-step Q-learning (MQL) algorithm as an autonomic approach to joint radio resource management (JRRM) among heterogeneous radio access technologies (RATs) in the B3G environment. Through a 'trial-and-error' online learning process, the JRRM controller converges to the optimized admission control policy. The JRRM controller learns to give the best allocation for each session in terms of both the access RAT and the service bandwidth. Simulation results show that the proposed algorithm realizes the autonomy of JRRM and achieves a good trade-off between spectrum utility and blocking probability compared to the load-balancing algorithm and the utility-maximizing algorithm. Moreover, the proposed algorithm has better online performance and faster convergence than the one-step Q-learning (QL) algorithm, so the user satisfaction degree can also be improved.
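The abstract does not reproduce the update rule, but the distinguishing feature of MQL over one-step QL is the n-step backup: instead of bootstrapping after a single reward, the controller backs up the discounted sum of the next n rewards, Q(s_t, a_t) ← Q(s_t, a_t) + α[Σ_{k=0}^{n−1} γ^k r_{t+k} + γ^n max_a Q(s_{t+n}, a) − Q(s_t, a_t)]. The sketch below illustrates this for a session-admission setting where an action is an (access RAT, bandwidth) pair. It is a minimal illustration under assumed encodings: the RAT/bandwidth sets, state labels, reward values, and parameters (ALPHA, GAMMA, EPSILON, N) are hypothetical stand-ins, not the paper's JRRM simulation model.

```python
import random
from collections import defaultdict

# Hypothetical action space: each action allocates a session to one RAT
# at one service bandwidth (values are illustrative, not from the paper).
RATS = ["RAT_A", "RAT_B"]
BANDWIDTHS = [64, 128, 256]  # kbps
ACTIONS = [(rat, bw) for rat in RATS for bw in BANDWIDTHS]

ALPHA, GAMMA, EPSILON, N = 0.1, 0.9, 0.1, 3  # assumed learning parameters

Q = defaultdict(float)  # Q[(state, action)] -> value estimate


def choose_action(state):
    """Epsilon-greedy 'trial-and-error' choice of (access RAT, bandwidth)."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def n_step_update(transitions, next_state):
    """Back up the oldest (state, action) in `transitions` using the
    discounted sum of the N observed rewards plus the bootstrapped value
    of next_state -- the n-step return that distinguishes MQL from
    one-step Q-learning."""
    state, action, _ = transitions[0]
    g = sum(GAMMA ** k * r for k, (_, _, r) in enumerate(transitions))
    g += GAMMA ** len(transitions) * max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (g - Q[(state, action)])


# Usage: keep a sliding window of the last N (state, action, reward)
# transitions and back up its oldest entry each step.
window = [("load_low", ("RAT_A", 128), 1.0),
          ("load_mid", ("RAT_B", 64), 0.5),
          ("load_mid", ("RAT_A", 64), 0.8)]
n_step_update(window, "load_high")
```

Propagating several rewards per backup is what gives the multi-step variant its reported advantage in online performance and convergence speed: admission decisions are credited with their medium-term effect on load and blocking rather than only the immediate reward.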