
A Q-learning Based Autonomic Joint Radio Resource Management Algorithm (基于Q学习的自主联合无线资源管理算法)

Cited by: 9
Abstract: A Q-learning based Joint Radio Resource Management (JRRM) algorithm is proposed for autonomic resource optimization in a B3G system with heterogeneous Radio Access Technologies (RATs). Through "trial-and-error" interactions with the radio environment, the JRRM controller learns to allocate the proper RAT and service bandwidth to each session. To reduce the memory requirement, a backpropagation neural network is adopted to generalize the large input state space. Simulation results show that the proposed algorithm not only realizes the autonomy of JRRM through online learning, but also achieves a good trade-off between spectrum utility and blocking probability.
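The abstract names the technique but not its mechanics. The following is a minimal Python sketch of that combination: Q-learning whose value function is approximated by a small backpropagation-trained neural network instead of a lookup table, with epsilon-greedy exploration standing in for the "trial-and-error" interaction. The state layout, the joint (RAT, bandwidth) action set, the network size, and the reward handling are all illustrative assumptions, not the paper's actual formulation.

```python
# Hedged sketch of a Q-learning JRRM controller with a backpropagation
# neural network as Q-function approximator. Dimensions and names are
# illustrative assumptions, not the authors' design.
import numpy as np

N_RATS = 2            # e.g. one cellular RAT and one WLAN (assumption)
N_BW_LEVELS = 3       # discrete bandwidth choices per session (assumption)
N_ACTIONS = N_RATS * N_BW_LEVELS
STATE_DIM = 4         # e.g. per-RAT loads plus session descriptors (assumption)
HIDDEN = 16

rng = np.random.default_rng(0)

class QNetwork:
    """One-hidden-layer MLP trained by plain backpropagation.

    Maps a continuous JRRM state to one Q-value per joint (RAT, bandwidth)
    action, replacing the lookup table to cut memory requirements."""
    def __init__(self, lr=0.01):
        self.W1 = rng.normal(0, 0.1, (STATE_DIM, HIDDEN))
        self.b1 = np.zeros(HIDDEN)
        self.W2 = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))
        self.b2 = np.zeros(N_ACTIONS)
        self.lr = lr

    def forward(self, s):
        self.h = np.tanh(s @ self.W1 + self.b1)   # hidden activations, cached
        return self.h @ self.W2 + self.b2          # one Q-value per action

    def update(self, s, a, target):
        # Backpropagate the TD error for the taken action only.
        q = self.forward(s)
        err = q[a] - target
        one_hot = np.eye(N_ACTIONS)[a]
        dh = self.W2[:, a] * err * (1 - self.h ** 2)  # tanh derivative
        self.W2 -= self.lr * np.outer(self.h, one_hot) * err
        self.b2 -= self.lr * one_hot * err
        self.W1 -= self.lr * np.outer(s, dh)
        self.b1 -= self.lr * dh

def select_action(net, s, epsilon=0.1):
    """Epsilon-greedy 'trial-and-error' exploration over joint actions."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(net.forward(s)))

def q_learning_step(net, s, a, r, s_next, gamma=0.9, terminal=False):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    target = r if terminal else r + gamma * np.max(net.forward(s_next))
    net.update(s, a, target)

# Illustrative single transition: admit a session, observe a reward that
# would trade spectrum utility against blocking, and update online.
net = QNetwork()
s = rng.random(STATE_DIM)
a = select_action(net, s)
q_learning_step(net, s, a, r=1.0, s_next=rng.random(STATE_DIM))
```

The target in q_learning_step is the standard Watkins-Dayan update (reference 9 below), Q(s,a) <- Q(s,a) + alpha [r + gamma max_a' Q(s',a') - Q(s,a)], here realized as a regression target for the network; the network's only job is to generalize across states a table could not store, which is the memory argument the abstract makes.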
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2008, No. 3: 676-680 (5 pages). Indexed: EI, CSCD, Peking University Core (北大核心).
Funding: EU FP6 End-to-End Reconfigurability project (IST-2005-027714), Key Project of the National Natural Science Foundation of China (60502035), National 863 Program (2006AA01Z276), and the MOST China-EU Science and Technology Cooperation Project (0516).
Keywords: Radio Access Technology (RAT), Joint admission control, Bandwidth allocation, Q-learning, Neural network

References (10)

1. Song Q and Jamalipour A. Network selection in an integrated wireless LAN and UMTS environment using mathematical modeling and computing techniques[J]. IEEE Wireless Commun., 2005, 12(3): 42-48.
2. 3GPP TR 25.881 v5.0.0. Improvement of RRM across RNS and RNS/BSS (Release 5)[OL]. http://www.3gpp.org, Dec. 2001.
3. IST-2003-507995 Project E2R (End-to-End Reconfigurability)[OL]. http://e2r.motlabs.com, Jan. 2004.
4. Agusti R, Sallent O, Perez-Romero J, et al. A fuzzy-neural based approach for joint radio resource management in a beyond 3G framework[C]. First Int. Conf. on Quality of Service in Heterogeneous Wired/Wireless Networks, Barcelona, Mar. 2004: 216-224.
5. Luo J, Mohyeldin E, Dillinger M, et al. Performance analysis of joint radio resource management for reconfigurable terminals with multi-class circuit-switched services[C]. Wireless World Research Forum 12th Meeting, Toronto, Nov. 2004: 138-150.
6. Zhang Y, Zhang K, Ji Y, et al. Adaptive threshold joint load control in an end-to-end reconfigurable system[C]. IST Mobile and Wireless Summit 2006, Mykonos, Jun. 2006: 332-337.
7. Kaelbling L P, Littman M L, and Moore A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
8. Nie J and Haykin S. A Q-learning-based dynamic channel assignment technique for mobile communication systems[J]. IEEE Trans. on Vehicular Technology, 1999, 48(5): 1676-1687.
9. Watkins C J C H and Dayan P. Q-learning[J]. Machine Learning, 1992, 8(3): 279-292.
10. Radunovic B and Le Boudec J Y. Rate performance objectives of multihop wireless networks[J]. IEEE Trans. on Mobile Computing, 2004, 3(4): 334-349.

Co-cited References (98; first 10 shown)

1. Yang Guang, Yu Kai, Zhang Kui, and Zhang Ping. Joint radio resource management based on IEEE 802.11e MAC[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2005, 17(6): 662-666. Cited by 1.
2. Luo Qiang. Research on end-to-end reconfiguration technology[J]. Telecommunications Science (电信科学), 2006, 22(12): 40-45. Cited by 2.
3. Haykin S. Cognitive radio: brain-empowered wireless communications[J]. IEEE Journal on Selected Areas in Communications, 2005, 23(2): 201-220.
4. Kaelbling L P, Littman M L, and Moore A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
5. Nie J and Haykin S. A Q-learning-based dynamic channel assignment technique for mobile communication systems[J]. IEEE Trans. on Vehicular Technology, 1999, 48: 1676-1687.
6. Watkins C J C H. Learning from delayed rewards[D]. Cambridge: University of Cambridge, 1989.
7. Watkins C J C H and Dayan P. Q-learning[J]. Machine Learning, 1992, 8: 279-292.
8. Ngai D C K and Yung N H C. Double action Q-learning for obstacle avoidance in a dynamically changing environment[C]. Proceedings of the 2005 IEEE Intelligent Vehicles Symposium, Las Vegas, 2005.
9. Ngai D C K and Yung N H C. Performance evaluation of double action Q-learning in moving obstacle avoidance problem[C]. Proceedings of the 2005 IEEE International Conference on Systems, Man, and Cybernetics, Hawaii, 2005.
10. Abd-Almageed W, El-Osery A I, and Smith C E. Estimating time-varying densities using a stochastic learning automaton[J]. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 2006, 10(11): 1007-1020.

Citing Literature (9)

Secondary Citing Literature (25)
