Abstract
This paper studies the optimization of sensing and access policies in opportunistic spectrum access. Taking the belief Markov decision process, which is equivalent to the original partially observable Markov decision process, as the basic model, and building on the concept of performance potential, the performance difference between policies is analyzed from a sensitivity-based view, and a policy iteration algorithm for optimizing the sensing and access policy is derived. Then, by analyzing the sample path of the system and exploiting the fact that the continuous state space in this problem can be aggregated, a sample-path-based implementation of the policy iteration algorithm is developed. Two simulation examples illustrate the effectiveness of the algorithm.
Source
Control and Decision (《控制与决策》)
EI
CSCD
PKU Core Journals (北大核心)
2010, No. 6, pp. 857-861, 866 (6 pages)
Funding
National Natural Science Foundation of China (Grants 60574064, 60736027)
Keywords
Opportunistic spectrum access
Belief Markov decision process
Performance potential
Policy iteration