
Energy Efficiency Optimization Task Processing Mechanism Based on Reinforcement Learning in WSN
Abstract: Aiming at improving the energy efficiency of task processing in wireless sensor networks, a near-optimal task processing mechanism is proposed, in which wireless sensor nodes can dynamically offload tasks to edge servers or process them locally according to the number of tasks in the task buffer and the channel conditions. The task processing mechanism is modeled as a Markov decision process. Since the wireless sensor node does not know the state transition probabilities of this process, the A3C algorithm is used to explore and learn under unknown environmental parameters, so as to obtain an approximately optimal task processing strategy. Under given buffer and channel conditions, this strategy selects the optimal number of tasks, modulation level and transmission power, improving the average energy efficiency of task processing. Simulation results show that, compared with other mechanisms, the proposed task processing mechanism improves node energy efficiency and converges faster.
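The abstract's core idea, a node learning when to offload versus process locally from buffer and channel state, can be illustrated with a toy sketch. All numbers and the environment below are illustrative assumptions, not the paper's model, and a tabular one-step actor-critic stands in for A3C (which adds neural-network function approximation and asynchronous parallel workers on top of the same actor-critic learning rule):

```python
import math
import random

# Assumed toy costs: local processing has a fixed energy cost, while
# offloading costs little on a "good" channel and a lot on a "bad" one.
E_LOCAL = 4.0
E_TX = {"good": 1.0, "bad": 6.0}
ACTIONS = ("local", "offload")

def step(channel, action):
    """Return (reward, next_channel); reward is tasks done per unit energy."""
    energy = E_LOCAL if action == "local" else E_TX[channel]
    reward = 1.0 / energy  # energy-efficiency proxy
    # Two-state Markov channel: stays in the same state with prob. 0.8.
    next_channel = channel if random.random() < 0.8 else \
        ("bad" if channel == "good" else "good")
    return reward, next_channel

def train(episodes=3000, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
    random.seed(seed)
    prefs = {(s, a): 0.0 for s in E_TX for a in ACTIONS}  # actor: preferences
    value = {s: 0.0 for s in E_TX}                        # critic: state values
    s = "good"
    for _ in range(episodes):
        # Softmax policy over action preferences for the current state.
        z = [math.exp(prefs[(s, a)]) for a in ACTIONS]
        probs = [x / sum(z) for x in z]
        a = random.choices(ACTIONS, probs)[0]
        r, s2 = step(s, a)
        td = r + gamma * value[s2] - value[s]  # TD error as advantage estimate
        value[s] += beta * td                  # critic update
        for i, act in enumerate(ACTIONS):      # actor: policy-gradient update
            grad = (1.0 if act == a else 0.0) - probs[i]
            prefs[(s, act)] += alpha * td * grad
        s = s2
    return prefs

prefs = train()
# With these assumed costs, offloading should be preferred on a good
# channel and local processing on a bad one.
print(prefs[("good", "offload")] > prefs[("good", "local")])
print(prefs[("bad", "local")] > prefs[("bad", "offload")])
```

The key property mirrored here is the one the abstract relies on: the node never uses the channel's transition probabilities, it learns the policy purely from sampled rewards, which is why a model-free method such as A3C fits the problem.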
Authors: ZHANG Mingjie; ZHU Jiang (Chongqing Key Laboratory of Mobile Communications Technology, Engineering Research Center of Mobile Communications of the Ministry of Education, School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source: Journal of Signal Processing (CSCD, Peking University Core Journal), 2022, No. 3, pp. 609-618 (10 pages)
Funding: National Natural Science Foundation of China (61271260); Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ1400416)
Keywords: wireless sensor network; mobile edge computing; Markov decision process; reinforcement learning
