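A Microblog Recommendation Model Combining User Clustering and Bandits Algorithms
1
Authors: 何羽丰, 徐建民, 张彬. 《小型微型计算机系统》, CSCD, PKU Core, 2022, No. 10, pp. 2122-2130 (9 pages)
To address new-user cold start and data sparsity in microblog recommender systems, this paper proposes a microblog recommendation model. The model builds complete user classes by clustering important users and classifying ordinary users, represents each ordinary user's interests by the interests of the class, and uses a Bandits algorithm to generate a microblog recommendation list for the ordinary users in each complete user class. Feedback from ordinary users on the list updates the historical data of the complete user class they belong to. This handles new-user cold start, reduces data sparsity, and yields fairly accurate recommendations, offering a new approach to building microblog recommendation models. Experiments show that the model recommends posts that users are interested in, and that it improves on existing random-exploration, confidence-interval, and probability-matching algorithms by at least 5.62%, 5.43%, and 33.37%, respectively.
Keywords: microblog recommendation; user clustering; bandits algorithm; cold start; data sparsity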
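The idea of keeping bandit statistics per user class, so that a new user inherits the history of the class, can be sketched as follows. This is a minimal illustration only: the abstract does not say which bandit algorithm the authors use, so Beta-Bernoulli Thompson sampling is an assumption, and the class name `ClusterBandit` and its methods are hypothetical.

```python
import random


class ClusterBandit:
    """Beta-Bernoulli Thompson sampling with statistics shared per user cluster.

    Assumption: feedback is binary (clicked / not clicked) and every user has
    already been assigned to a cluster by some external clustering step.
    """

    def __init__(self, n_clusters, n_items, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior for every (cluster, item) pair.
        self.alpha = [[1.0] * n_items for _ in range(n_clusters)]
        self.beta = [[1.0] * n_items for _ in range(n_clusters)]

    def recommend(self, cluster, k):
        """Return the k items with the highest sampled click probability."""
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.alpha[cluster], self.beta[cluster])
        ]
        ranked = sorted(range(len(samples)), key=lambda i: samples[i], reverse=True)
        return ranked[:k]

    def update(self, cluster, item, clicked):
        """Fold one user's feedback into the history of the whole cluster."""
        if clicked:
            self.alpha[cluster][item] += 1.0
        else:
            self.beta[cluster][item] += 1.0
```

Because the counts live at the cluster level, a user with no history still receives recommendations informed by every earlier interaction in the same cluster, which is one way to read the cold-start argument in the abstract.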
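Decentralized Federated Learning Based on Fed-DPDOBO
2
Authors: 杨巨, 邓志良, 杨志强, 王燕, 赵中原. 《计算机与现代化》, 2024, No. 4, pp. 99-106 (8 pages)
Federated learning with the traditional client-server architecture is an effective way to address the data-silo problem, but its central server faces heavy bandwidth pressure; decentralized peer-to-peer federated learning can ease this to some extent. However, federated learning clients still risk leaking private data, and the gradient of their cost functions is hard to obtain in some cases. To address these problems, this paper designs a Federated Differential Privacy Distributed One-point Bandit Online (Fed-DPDOBO) algorithm for peer-to-peer federated learning under consensus constraints, which handles both the bandwidth limit of a central server and unknown client gradient information. In addition, the use of differential privacy protects the data privacy of each client. Finally, decentralized federated learning experiments on the MNIST dataset verify the effectiveness of the algorithm.
Keywords: data silos; federated learning; consensus constraint; peer-to-peer architecture; differential privacy; one-point Bandit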
Adaptive Cyber Defense Technique Based on Multiagent Reinforcement Learning Strategies
3
Authors: Adel Alshamrani, Abdullah Alshahrani. 《Intelligent Automation & Soft Computing》, SCIE, 2023, No. 6, pp. 2757-2771 (15 pages)
The static nature of cyber defense systems gives attackers a sufficient amount of time to explore and further exploit the vulnerabilities of information technology systems. In this paper, we investigate a problem where multiagent systems sensing and acting in an environment contribute to adaptive cyber defense. We present a learning strategy that enables multiple agents to learn optimal policies using multiagent reinforcement learning (MARL). Our proposed approach is inspired by the multiarmed bandits (MAB) learning technique for multiple agents to cooperate in decision making or to work independently. We study a MAB approach in which defenders visit a system multiple times in an alternating fashion to maximize their rewards and protect their system. We find that this game can be modeled from an individual player's perspective as a restless MAB problem. We discover further results when the MAB takes the form of a pure birth process, such as a myopic optimal policy, as well as providing environments that offer the necessary incentives required for cooperation in multiplayer projects.
Keywords: multiarmed bandits; reinforcement learning; multiagents; intrusion detection systems
Matching while Learning: Wireless Scheduling for Age of Information Optimization at the Edge (cited by: 2)
4
Authors: Kun Guo, Hao Yang, Peng Yang, Wei Feng, Tony Q.S. Quek. 《China Communications》, SCIE, CSCD, 2023, No. 3, pp. 347-360 (14 pages)
In this paper, we investigate the minimization of age of information (AoI), a metric that measures information freshness, at the network edge with unreliable wireless communications. In particular, we consider a set of users transmitting status updates, which are collected by the user randomly over time, to an edge server through unreliable orthogonal channels. This raises a natural question: with random status update arrivals and obscure channel conditions, can we devise an intelligent scheduling policy that matches the users and channels to stabilize the queues of all users while minimizing the average AoI? To answer it, we define a bipartite graph and formulate a dynamic edge activation problem with stability constraints. Then, we propose an online matching while learning algorithm (MatL) and discuss its implementation for wireless scheduling. Finally, simulation results demonstrate that MatL reliably learns the channel states and manages the users' buffers for fresher information at the edge.
Keywords: information freshness; Lyapunov optimization; multi-armed bandit; wireless scheduling
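Adaptive-Quantization Distributed Online Mirror Descent Algorithm Based on Bandit Feedback
5
Authors: 谢俊如, 高文华, 谢奕彬. 《控制理论与应用》, EI, CAS, CSCD, PKU Core, 2023, No. 10, pp. 1774-1782 (9 pages)
Online distributed optimization of multi-agent systems is widely used for optimization problems in dynamic environments, where nodes must exchange data streams in real time. In many cases a node cannot access full information about its local objective function (including gradient information), and information exchange between nodes is subject to communication constraints. Given the advantages of the mirror descent algorithm, which works with non-Euclidean projections, for high-dimensional data and large-scale online learning, this paper estimates the missing gradient from the values of the local objective function at two points, designs an adaptive quantizer based on the properties of mirror descent, and proposes an adaptive-quantization distributed online mirror descent algorithm based on Bandit feedback. It then analyzes the relationship between the quantization error bound and the regret bound; with suitably chosen parameters, the regret bound of the proposed algorithm is O(√T). Finally, numerical simulations verify the effectiveness of the algorithm and the theoretical results.
Keywords: mirror descent algorithm; multi-agent systems; optimization; quantization; Bandit feedback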
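Estimating a gradient from function values at two points, as the abstract above describes, is commonly done with a symmetric difference along a random unit direction. The sketch below shows that standard estimator only; it is not taken from the paper, and the function names and the smoothing radius `delta` are illustrative assumptions.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def two_point_gradient(f, x, u, delta):
    """Estimate grad f(x) from the two values f(x + delta*u) and f(x - delta*u).

    The factor dim makes the estimate unbiased for linear f when u is
    uniform on the unit sphere.
    """
    dim = len(x)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = dim * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]
```

A single estimate is noisy, since it only sees the slope along `u`; averaging over many random directions recovers the gradient, which is why such estimators are paired with small step sizes in online algorithms.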
Distributed Weighted Data Aggregation Algorithm in End-to-Edge Communication Networks Based on Multi-armed Bandit (cited by: 1)
6
Authors: Yifei ZOU, Senmao QI, Cong'an XU, Dongxiao YU. 《计算机科学》, CSCD, PKU Core, 2023, No. 2, pp. 13-22 (10 pages)
As a combination of edge computing and artificial intelligence, edge intelligence has become a promising technique and provided its users with a series of fast, precise, and customized services. In edge intelligence, when learning agents are deployed on the edge side, the data aggregation from the end side to the designated edge devices is an important research topic. Considering the varying importance of end devices, this paper studies the weighted data aggregation problem in a single-hop end-to-edge communication network. First, to make sure all the end devices with various weights are fairly treated in data aggregation, a distributed end-to-edge cooperative scheme is proposed. Then, to handle the massive contention on the wireless channel caused by end devices, a multi-armed bandit (MAB) algorithm is designed to help the end devices find their most appropriate update rates. Unlike traditional data aggregation works, combining the MAB gives our algorithm higher efficiency in data aggregation. With a theoretical analysis, we show that the efficiency of our algorithm is asymptotically optimal. Comparative experiments with previous works are also conducted to show the strength of our algorithm.
Keywords: weighted data aggregation; end-to-edge communication; multi-armed bandit; edge intelligence
Stochastic programming based multi-arm bandit offloading strategy for internet of things
7
Authors: Bin Cao, Tingyong Wu, Xiang Bai. 《Digital Communications and Networks》, SCIE, CSCD, 2023, No. 5, pp. 1200-1211 (12 pages)
In order to solve the high latency of traditional cloud computing and the processing capacity limitation of Internet of Things (IoT) users, Multi-access Edge Computing (MEC) migrates computing and storage capabilities from the remote data center to the edge of the network, providing users with computation services quickly and directly. In this paper, we investigate the impact of the randomness caused by the movement of the IoT user on decision-making for offloading, where the connection between the IoT user and the MEC servers is uncertain. This uncertainty is the main obstacle to assigning the task accurately. Consequently, if the assigned task does not match the real connection time well, a migration is caused (the connection time is not enough to finish processing). In order to address the impact of this uncertainty, we formulate the offloading decision as an optimization problem considering transmission, computation, and migration. With the help of Stochastic Programming (SP), we use the posteriori recourse to compensate for inaccurate predictions. Meanwhile, in heterogeneous networks, considering that multiple candidate MEC servers could be selected simultaneously due to overlapping, we also introduce the Multi-Arm Bandit (MAB) theory for MEC selection. Extensive simulations validate the improvement and effectiveness of the proposed SP-based Multi-arm bandit Method (SMM) for offloading in terms of reward, cost, energy consumption, and delay. The results show that SMM can achieve about 20% improvement compared with the traditional offloading method that does not consider the randomness, and it also outperforms the existing SP/MAB-based method for offloading.
Keywords: multi-access computing; Internet of things; offloading; stochastic programming; multi-arm bandit
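Optimal Value Analysis of New Product Development: A Model Study Based on Bandit Processes (cited by: 6)
8
Authors: 谢武, 陈晓剑, 巩国顺. 《预测》, CSSCI, 2003, No. 4, pp. 75-77, 80 (4 pages)
The success or failure of new product development bears directly on a firm's survival and growth, so effective new product development is a constant goal of firms. This paper applies the theory of alternative Bandit processes to examine the optimal value of new product development. It concludes that the optimal value depends on the effectiveness of the Gittins index rule, that is, ultimately on market share, the accuracy of demand forecasts for the new product, the accuracy of forecasts of consumers' perceived value, and the effectiveness of the product launch. The more precisely these variables are predicted, the more effective the optimal rule and the greater the value of successful new product development.
Keywords: new product development; sequence; Bandit process; Gittins theorem; optimal value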
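An Improved Epsilon-greedy Algorithm for the New-User Cold-Start Problem (cited by: 1)
9
Authors: 王素琴, 张洋, 蒋浩, 朱登明. 《计算机工程》, CAS, CSCD, PKU Core, 2018, No. 11, pp. 172-177 (6 pages)
When addressing the new-user cold-start problem, a fixed Epsilon parameter makes the traditional Epsilon-greedy algorithm converge slowly. This paper therefore proposes an improved Epsilon-greedy algorithm that uses an immune feedback model to adjust the Epsilon parameter dynamically so that the algorithm converges quickly. Monte Carlo simulation experiments show that the algorithm makes effective recommendations when the user has had few interactions with the recommender system, and that it outperforms the traditional Epsilon-greedy, Softmax, and UCB algorithms.
Keywords: recommender system; cold start; Epsilon-greedy algorithm; immune feedback model; bandit algorithm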
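The effect of a time-varying Epsilon can be illustrated with a simple decay schedule. The sketch below is not the paper's immune feedback model, which the abstract does not specify; it swaps in the common schedule epsilon_t = eps0 / (1 + decay * t) as a stand-in, and all parameter names are illustrative.

```python
import random


def epsilon_greedy(reward_fn, n_arms, n_rounds, eps0=1.0, decay=0.05, seed=0):
    """Run epsilon-greedy with epsilon_t = eps0 / (1 + decay * t).

    reward_fn(arm, rng) returns the reward of pulling `arm`.
    Returns the pull counts and the estimated mean reward of every arm.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(n_rounds):
        epsilon = eps0 / (1.0 + decay * t)
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: uniform random arm
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit
        reward = reward_fn(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means
```

Early rounds explore almost uniformly, and later rounds exploit the best estimate, so the exploration cost does not keep growing linearly as it does with a fixed Epsilon.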
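MOOB: An Improved Recommendation Algorithm Based on the Bandit Model (cited by: 1)
10
Authors: 帖军, 孙荣苑, 孙翀, 郑禄. 《中南民族大学学报(自然科学版)》, CAS, 2018, No. 1, pp. 114-119 (6 pages)
This paper proposes a multi-objective optimization recommendation algorithm based on the upper confidence bound algorithm. While preserving prediction accuracy, the algorithm effectively avoids the Matthew effect and improves the recommender system's ability to surface long-tail items. Experiments and evaluation on Yahoo's news recommendation dataset show that the multi-objective optimization recommendation algorithm uncovers long-tail items and avoids the Matthew effect while keeping prediction accuracy high, improving both the precision and the breadth of the recommender system.
Keywords: Bandit model; Matthew effect; long-tail phenomenon; multi-objective optimization; coverage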
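One-Armed Erlang(k) Bandit Reward Processes (cited by: 1)
11
Authors: 邹捷中, 邓倩, 梁友. 《长沙电力学院学报(自然科学版)》, 2006, No. 4, pp. 69-71, 77 (4 pages)
Using Bayesian methods, this paper proposes an algorithm that computes the sequence of break-even values describing the optimal choice for one-armed Erlang(k) Bandit reward processes, in which the sampled rewards of the unknown Bandit reward process follow an Erlang(k) distribution. The algorithm solves the optimal decision problem for one-armed Erlang(k) Bandit reward processes and extends the distribution underlying Bandit reward processes from the negative exponential distribution to one that is more widely used in practice, supplementing and generalizing Bandit reward processes. Numerical computation with the algorithm yields approximate values of the Gittins index.
Keywords: Bayesian method; multi-armed Bandit process; one-armed Bandit process; Gittins index; break-even value; Bandit reward process; distribution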
Optimal index shooting policy for layered missile defense system (cited by: 1)
12
Authors: LI Longyue, FAN Chengli, XING Qinghua, XU Hailong, ZHAO Huizhen. 《Journal of Systems Engineering and Electronics》, SCIE, EI, CSCD, 2020, No. 1, pp. 118-129 (12 pages)
In order to cope with the increasing threat of the ballistic missile (BM) in a shorter reaction time, the shooting policy of the layered defense system needs to be optimized. The main decision-making problem of shooting optimization is how to choose the next BM to shoot at according to the previous engagements and results, thus maximizing the expected return of BMs killed or minimizing the cost of BM penetration. Motivated by this, this study aims to determine an optimal shooting policy for a two-layer missile defense (TLMD) system. This paper considers a scenario in which the TLMD system wishes to shoot at a collection of BMs one at a time, and to maximize the return obtained from BMs killed before the system's demise. To provide a policy analysis tool, this paper develops a general model for shooting decision-making in which the shooting engagements are described as a discounted-reward Markov decision process. The index shooting policy is a strategy that effectively balances the shooting returns against the risk that the defense mission fails. The numerical results show that the index policy is better than a range of competitors, especially in mean return and mean number of BMs killed.
Keywords: Gittins index; shooting policy; layered missile defense; multi-armed bandits problem; Markov decision process
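Solving the E&E Problem in Recommender Systems with Bandit Algorithms (cited by: 1)
13
Author: 高海宾. 《韶关学院学报》, 2017, No. 9, pp. 22-26 (5 pages)
The E&E problem is common in the development and use of current recommender systems. The author describes how E&E problems arise in recommender systems and how they are classified, proposes using Bandit algorithms to solve them, focuses on the mathematical model of the Bandit algorithm and on a Bandit algorithm model built with the UCB strategy, implements the core simulation program in MATLAB, and points out the strengths and weaknesses of this model.
Keywords: Bandit algorithm; recommender system; E&E problem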
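For readers unfamiliar with the UCB strategy mentioned above, the textbook UCB1 rule is sketched below. The paper's own simulation is written in MATLAB and is not reproduced here; this Python version is an independent illustration, and the function name and the `reward_fn` callback are assumptions.

```python
import math


def ucb1(reward_fn, n_arms, n_rounds):
    """Run UCB1: pull each arm once, then maximize mean + sqrt(2 ln t / n).

    reward_fn(arm) returns a reward in [0, 1].
    Returns the pull counts and the estimated mean reward of every arm.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: try every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fn(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means
```

The square-root bonus shrinks as an arm is pulled more often, so rarely tried items keep getting occasional exposure (exploration) while well-estimated good items dominate (exploitation).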
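A Special One-Armed Bandit Reward Process with Sampling Time Intervals
14
Authors: 邹捷中, 梁友. 《铁道科学与工程学报》, CAS, CSCD, PKU Core, 2006, No. 6, pp. 87-90 (4 pages)
Using backward induction from dynamic programming together with Bayesian methods, this paper studies the optimal decision problem for a special class of one-armed Bandit reward processes. In this model, the unknown Bandit process is a Bandit reward process whose sampling time intervals follow a negative exponential distribution, whose sampled values follow an Erlang(2) distribution, and in which switching is allowed at any time. The paper discusses the monotonicity of the Gittins index for this class of Bandit reward processes and, on that basis, reduces the optimal decision problem for one-armed Bandit reward processes containing such a process to an optimal stopping problem, then constructs an algorithm for computing the optimal stopping time.
Keywords: Bayesian method; special one-armed Bandit reward process; Gittins index; Erlang(2) distribution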
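A Multi-Armed Bandit Recommendation Algorithm Based on Content and the Nearest-Neighbor Algorithm (cited by: 3)
15
Authors: 王高智, 肖菁. 《华南师范大学学报(自然科学版)》, CAS, PKU Core, 2019, No. 1, pp. 120-127 (8 pages)
To address the cold-start and dynamic data modeling problems of recommender systems, this work builds on multi-armed bandit and collaborative filtering algorithms and uses user feedback to update the recommendation model online and promptly. The cold-start problem is recast as an exploration and exploitation (Explore & Exploit, E&E) problem. Using a multi-armed bandit algorithm with user features as content, and further taking collaboration among users into account, the paper proposes a multi-armed bandit recommendation algorithm based on content and the nearest-neighbor algorithm. Comparative experiments on the real MovieLens and Jester datasets show that the kNNUCB algorithm performs better and is more practical, with a marked effect on the cold-start problem in particular.
Keywords: recommender system; multi-armed bandit; nearest-neighbor algorithm; cold start; Bandit algorithm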
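A Multipath Bandit Optimization Algorithm for Emergency Converged Network Applications
16
Authors: 伍富, 郑霖, 李晓记. 《计算机工程》, CAS, CSCD, PKU Core, 2017, No. 3, pp. 134-139 (6 pages)
Traditional wireless communication networks have a single structure and limited performance, which makes it hard to guarantee the quality of emergency communications. Against the new background of converging cognitive wireless ad hoc networks with mobile cellular networks, this paper proposes a multipath Bandit algorithm. Route selection during communication is divided into multi-slot path-selection sub-stages, and paths are chosen by evaluating an objective function that trades off network delay against energy efficiency, so that energy consumption is distributed sensibly across the nodes of the network. Simulation results show that, compared with non-emergency service applications and a greedy algorithm, the multipath Bandit algorithm extends network lifetime by 3% to 20% for emergency services in the converged network.
Keywords: converged network; emergency communication; Bandit theory; finite-state Markov chain; multipath; multi-gateway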
Age of Transmission-Optimal Scheduling for State Update of Multi-Antenna Cellular Internet of Things (cited by: 1)
17
Authors: Song Li, Min Li, Ruirui Chen, Yanjing Sun. 《China Communications》, SCIE, CSCD, 2022, No. 4, pp. 302-314 (13 pages)
Timely information updates are critical for real-time monitoring and control applications in the Internet of Things (IoT). In this paper, we consider a multi-antenna cellular IoT for state update where a base station (BS) collects information from randomly distributed IoT nodes through a time-varying channel. Specifically, multiple IoT nodes are allowed to transmit their state updates simultaneously in a spatially multiplexed manner. Inspired by age of information (AoI), we introduce a novel concept of age of transmission (AoT) for scenarios in which the BS cannot obtain the generation time of the packets waiting to be transmitted. The deadline-constrained AoT-optimal scheduling problem is formulated as a restless multi-armed bandit (RMAB) problem. First, we prove the indexability of the scheduling problem and derive the closed form of the Whittle index. Then, the interference graph and complementary graph are constructed to illustrate the interference between two nodes. The complete subgraphs are detected in the complementary graph to avoid inter-node interference. Next, an AoT-optimal scheduling strategy based on the Whittle index and complete subgraph detection is proposed. Finally, numerous simulations are conducted to verify the performance of the proposed strategy.
Keywords: age of transmission; information freshness; cellular IoT; restless multi-armed bandit; Whittle index
Millimeter-Wave Concurrent Beamforming: A Multi-Player Multi-Armed Bandit Approach (cited by: 1)
18
Authors: Ehab Mahmoud Mohamed, Sherief Hashima, Kohei Hatano, Hani Kasban, Mohamed Rihan. 《Computers, Materials & Continua》, SCIE, EI, 2020, No. 12, pp. 1987-2007 (21 pages)
Communication in the millimeter-wave (mmWave) band, i.e., 30~300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabits per second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rates of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding beam directions that maximize the long-term average sum rates of the concurrent links. Specifically, the problem is formulated as a multiplayer multiarmed bandit (MAB), where mmWave APs act as the players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multiplayer MAB strategy is advocated. Four different MAB algorithms, namely ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and the exponential weight algorithm for exploration and exploitation (EXP3), are examined by employing them in each AP to selfishly enhance its beam selection based only on its previous observations. After a few rounds of interactions, mmWave APs learn how to select concurrent beams that enhance the overall system performance. The proposed MAB-based mmWave concurrent BF shows comparable performance to the optimal solution.
Keywords: millimeter wave (mmWave); concurrent transmissions; reinforcement learning; multiarmed bandit (MAB)
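Of the four algorithms named in the abstract above, EXP3 is the least self-explanatory, so its two core steps are sketched here in the standard textbook form. This is not the paper's implementation. Reading each arm as a beam direction and each reward as a data rate normalized to [0, 1] is an assumption made for illustration, as are the function names.

```python
import math


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration of rate gamma."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, probs, arm, reward, gamma):
    """Return new weights after playing `arm` and observing `reward` in [0, 1].

    Only the played arm changes; its reward is divided by the probability of
    playing it, which keeps the reward estimate unbiased.
    """
    k = len(weights)
    estimate = reward / probs[arm]
    new_weights = list(weights)
    new_weights[arm] *= math.exp(gamma * estimate / k)
    return new_weights
```

In each round a player would draw an arm from `exp3_probabilities`, observe its reward, and call `exp3_update`. Because the rule makes no statistical assumption about rewards, it remains usable when other players' choices change the interference from round to round.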
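Online Distributed Mirror Descent Algorithm Based on Bandit Feedback
19
Authors: 朱小梅, 李觉友. 《西南大学学报(自然科学版)》, CAS, CSCD, PKU Core, 2022, No. 1, pp. 99-107 (9 pages)
For a class of online distributed optimization problems in which gradient information of the loss function is hard to obtain, this paper proposes an online distributed mirror descent algorithm based on Bandit feedback (ODMD-B). First, it extends the online distributed mirror descent (ODMD) algorithm to the gradient-free setting with a new gradient estimation method that uses only function values, namely Bandit feedback. The key idea is to approximate gradient information from loss function values, which overcomes the difficulty of gradients that are unavailable or expensive to compute. It then gives a convergence analysis of the algorithm, which shows a convergence rate of O(T), where T is the number of iterations. Finally, numerical simulations are run on a portfolio selection model. The results show that the convergence rate of ODMD-B is close to that of the existing ODMD algorithm. Compared with ODMD, the advantage of the proposed algorithm is that it uses only function values, which are cheap to compute, making it better suited to optimization problems where gradient information is hard to obtain.
Keywords: online learning; distributed optimization; mirror descent algorithm; Bandit feedback; regret bound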
A DISTRIBUTED COOPERATIVE RELAYING OPTIMIZATION SCHEME FOR SECONDARY TRANSMISSION IN COGNITIVE RADIO NETWORKS
20
Authors: Chert Dan, Ji Hong. 《Journal of Electronics(China)》, 2011, No. 1, pp. 8-14 (7 pages)
In Cognitive Radio (CR) networks, cooperative communication has recently been regarded as a key technology for improving spectral utilization efficiency and ensuring the Quality of Service (QoS) for Primary Users (PUs). In this paper, we propose a distributed joint relay selection and power allocation scheme for cooperative secondary transmission, taking both Instantaneous Channel State Information (I-CSI) and residual energy into consideration, where the secondary source and destination may have different available spectrum. Specifically, we formulate the cognitive relay network as a restless bandit system, where the channel and energy state transition is characterized by a finite-state Markov chain. The proposed policy has the indexability property, which dramatically reduces the computation and implementation complexity. Analytical and simulation results demonstrate that our proposed scheme can efficiently enhance overall system reward while guaranteeing a good tradeoff between achievable data rate and average network lifetime.
Keywords: Cognitive Radio (CR); secondary transmission; relay selection; power allocation; restless bandit system