Abstract
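This paper first introduces the background against which cognitive radio technology emerged, along with the development of reinforcement learning and its advantages when applied to the cognitive domain. It then presents the basic principles of reinforcement learning and two of its common models, Q-Learning and POMDP, describing in some detail their model definitions, underlying ideas, the problems they address, and the scenarios in which they are used. Next, it analyzes the main content of papers on this topic from top conferences and journals of recent years. Drawing on the research status and results reported in those papers, it shows that the main characteristics of reinforcement learning are that it can learn an optimal policy accurately and quickly, can model real environments, is highly adaptive, and improves the efficiency of spectrum sensing and allocation, thereby maximizing system throughput. These advantages demonstrate that reinforcement learning is a promising technique in the cognitive domain.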
This paper briefly sketches the background and characteristics of cognitive radio and reinforcement learning technology. It reviews the main research directions in cognitive radio for dynamic spectrum allocation (DSA), including an introduction to two common reinforcement learning models: Q-learning and the partially observable Markov decision process (POMDP). It then analyzes the research content and developments for DSA based on these two models in recent years. Finally, it draws conclusions and forecasts the future development trend of the field.
Source
《数字通信》
2012, No. 4, pp. 34-38 (5 pages)
Digital Communications and Networks