Adaptive Modulation of Underwater Acoustic Communication Based on SARSA Algorithm (cited by: 4)
Abstract: Underwater acoustic channels are complex and time-varying. In an adaptive modulation system the feedback information arrives with a large delay, so the feedback received no longer matches the actual channel state. Because the fed-back channel state information is outdated, the transmitter cannot make accurate adaptive decisions, which leads to a high bit error rate and low throughput. To address this problem, the SARSA algorithm from reinforcement learning is used to learn the channel variation and select the behavior policy, choosing the best modulation scheme as the channel changes in order to improve the system's bit error rate and throughput. The bit error rate and throughput of the resulting system are compared with those obtained under fixed modulation and under direct feedback. The results show that the system with reinforcement learning outperforms both alternatives on bit error rate and on throughput, indicating that reinforcement learning is effective and feasible for improving transmission errors and throughput in adaptive modulation over time-varying underwater acoustic channels.
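Authors: WANG An-yi (王安义), LI Ping (李萍), ZHANG Yu-zhi (张育芝) (School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710054, China)
Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2020, No. 16, pp. 6505-6509 (5 pages)
Funding: National Natural Science Foundation of China (61801372); Scientific Research Program of the Shaanxi Provincial Department of Education (18JK0499); Cultivation Fund of Xi'an University of Science and Technology (201747).
Keywords: underwater acoustic communication; adaptive modulation; reinforcement learning; SARSA algorithm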
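
The abstract describes using SARSA to pick a modulation scheme while the channel state feedback is outdated. The record does not give the paper's state space, action set, reward, or channel model, so the following is only a minimal sketch of tabular SARSA under stated assumptions, not the authors' implementation. The four modulation schemes, the four quantized channel-quality levels, the random-walk channel, the success probabilities, and the one-step feedback delay are all hypothetical choices made for illustration. The only part taken from the standard algorithm is the on-policy update Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)].

```python
import random

# Hypothetical action set and state space (not taken from the paper).
MODULATIONS = ["BPSK", "QPSK", "8PSK", "16QAM"]
BITS_PER_SYMBOL = [1, 2, 3, 4]
N_STATES = 4  # assumed number of quantized channel-quality levels


def step_channel(state, rng):
    """Toy time-varying channel: clipped random walk over quality levels."""
    move = rng.choice([-1, 0, 1])
    return min(max(state + move, 0), N_STATES - 1)


def reward(actual_state, action, rng):
    """Toy reward: bits per symbol if the frame survives, else 0.

    Assumption: modulation index `action` is reliable only when the
    actual channel level is at least `action`.
    """
    p_success = 0.95 if actual_state >= action else 0.05
    return float(BITS_PER_SYMBOL[action]) if rng.random() < p_success else 0.0


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(MODULATIONS))
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


def train_sarsa(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular SARSA where the observed state is the last fed-back level."""
    rng = random.Random(seed)
    q = [[0.0] * len(MODULATIONS) for _ in range(N_STATES)]
    state = rng.randrange(N_STATES)
    action = epsilon_greedy(q, state, epsilon, rng)
    for _ in range(steps):
        # The channel has already moved on by the time the frame is sent,
        # so the reward depends on the new level, not the fed-back one.
        next_state = step_channel(state, rng)
        r = reward(next_state, action, rng)
        next_action = epsilon_greedy(q, next_state, epsilon, rng)
        # On-policy update: bootstrap from the action actually chosen next.
        td_target = r + gamma * q[next_state][next_action]
        q[state][action] += alpha * (td_target - q[state][action])
        state, action = next_state, next_action
    return q


def greedy_policy(q):
    """Map each fed-back channel level to the modulation with the highest Q."""
    return [MODULATIONS[max(range(len(row)), key=row.__getitem__)] for row in q]
```

Calling `greedy_policy(train_sarsa())` returns one modulation name per fed-back channel level. Because the update bootstraps from the action the policy actually takes next, the learned values reflect the exploring policy itself, which is the property that distinguishes SARSA from off-policy Q-learning.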