
Transmission scheduling scheme based on deep Q-learning in wireless networks (cited 6 times)

Abstract  To address the problem of data transmission in wireless networks, a transmission scheduling scheme based on deep Q-learning (QL) is proposed. A Markov decision process (MDP) system model is formulated to describe the state transitions of the system. The Q-learning algorithm is used to learn and explore the system's state-transition information when the state-transition probabilities are unknown, so as to obtain an approximately optimal policy for the scheduling node. In addition, when the state space is large, deep learning (DL) is employed to learn the mapping between states and actions, avoiding the heavy computation and storage that solving for the policy would otherwise require. Simulation results show that the proposed scheme approaches the optimal policy obtained by policy iteration in terms of power consumption, throughput, and packet loss rate, while having lower complexity and overcoming the curse of dimensionality.
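As a rough, illustrative sketch of the approach described in the abstract (not the authors' implementation), the Python fragment below shows an epsilon-greedy deep Q-learning update in PyTorch: a small network replaces the Q-table and is trained toward the temporal-difference target r + γ·max_a' Q(s', a'). The state features (queue length and channel level), the four candidate transmit-power actions, the network size, and the reward values are assumptions made for illustration only; the paper's exact MDP formulation is not reproduced here.

```python
import random
import torch
import torch.nn as nn

N_STATE_FEATURES = 2   # assumed: normalized queue length, quantized channel gain
N_ACTIONS = 4          # assumed: four candidate transmit power levels
GAMMA = 0.9            # discount factor of the underlying MDP
EPSILON = 0.1          # epsilon-greedy exploration rate

# The Q-network maps a state vector to one Q-value per scheduling action,
# standing in for the Q-table when the state space is too large to enumerate.
q_net = nn.Sequential(nn.Linear(N_STATE_FEATURES, 32), nn.ReLU(), nn.Linear(32, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def select_action(state):
    """Epsilon-greedy selection of a scheduling action for the given state."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def train_step(state, action, reward, next_state):
    """One temporal-difference update: target = r + gamma * max_a' Q(s', a')."""
    q_pred = q_net(torch.tensor(state, dtype=torch.float32))[action]
    with torch.no_grad():
        q_next = q_net(torch.tensor(next_state, dtype=torch.float32)).max()
    loss = (q_pred - (reward + GAMMA * q_next)) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

# Illustrative usage with made-up values: state = [queue_len/10, channel_level/4]
s = [0.3, 0.5]
a = select_action(s)
r, s_next = 1.0, [0.2, 0.75]   # assumed reward: throughput minus power/loss penalty
train_step(s, a, r, s_next)
```

When the state space is small enough to enumerate, the same update rule can be applied directly to a Q-table; the deep variant above is the abstract's remedy for the case where tabular storage and computation become infeasible.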
Authors  ZHU Jiang; WANG Tingting; SONG Yonghui; LIU Yali (Key Laboratory of Information and Communication Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source  Journal on Communications (通信学报), 2018, No. 4, pp. 35-44 (10 pages); indexed in EI, CSCD, and the Peking University Core Journals list
Funding  Supported by the National Natural Science Foundation of China (No. 61102062, No. 61271260, No. 61301122) and the Chongqing Basic and Frontier Research Project (No. cstc2015jcyjA40050)
Keywords  wireless network transmission; Markov decision process; Q-learning; deep learning

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部