An Intelligent I/O Scheduling Algorithm Based on Reinforcement Learning

Cited by: 2

Abstract: Applying machine learning to technical problems in storage systems is an active research topic. Reinforcement learning is a form of machine learning that takes environmental feedback as input and adapts to its environment: by observing changes in the environment state and evaluating how control decisions affect system performance, it selects the best control policy. This makes reinforcement-learning-based intelligent RAID control a valuable research direction. Targeting the characteristics of high-performance computing applications, this paper introduces reinforcement learning into the RAID controller and proposes RL-scheduler, an intelligent I/O scheduling algorithm that uses a Q-learning strategy to implement a self-controlling, self-optimizing scheduler for parallel applications. RL-scheduler jointly considers scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications, and it proposes an interleaved organization of multiple Q-tables to make Q-table updates more efficient. Experimental results show that, on a large-scale parallel system running multiple parallel applications, RL-scheduler shortens the average I/O service time of parallel applications and thereby raises the I/O throughput of the system.
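For readers unfamiliar with the Q-learning strategy named in the abstract, the following is a minimal sketch of the standard tabular Q-learning update with epsilon-greedy action selection. It is not the paper's RL-scheduler: the abstract does not give the state encoding, action set, reward function, or the multi-Q-table layout, so the names `q_table`, `choose_action`, `q_update` and the constants `ALPHA`, `GAMMA`, `EPSILON` are all illustrative assumptions.

```python
import random
from collections import defaultdict

# All three constants are assumed values chosen for illustration only.
ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # exploration probability

# Q-table: maps (state, action) to an estimated long-term reward.
# States and actions are opaque hashable values in this sketch.
q_table = defaultdict(float)

def choose_action(state, actions, rng=random):
    """Epsilon-greedy choice among the currently available actions."""
    if rng.random() < EPSILON:
        return rng.choice(actions)          # explore
    return max(actions, key=lambda a: q_table[(state, a)])  # exploit

def q_update(state, action, reward, next_state, next_actions):
    """Standard one-step Q-learning update; returns the new Q-value."""
    best_next = max((q_table[(next_state, a)] for a in next_actions),
                    default=0.0)
    td_target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (td_target - q_table[(state, action)])
    return q_table[(state, action)]
```

In an I/O scheduling setting, one hypothetical mapping would treat a summary of the pending request queue as the state, the choice of which request to dispatch next as the action, and a measure such as negative service time as the reward. That mapping is an assumption made for illustration, not a description of RL-scheduler.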

Authors: Li Qiong, Guo Yufeng, Jiang Yanhuang

Affiliation: School of Computer Science, National University of Defense Technology

Source: Computer Engineering & Science (《计算机工程与科学》), 2010, No. 7, pp. 58-61 (4 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: Equipment Pre-Research Project (51316040301)

Keywords: machine learning; reinforcement learning; intelligent I/O scheduling; RAID controller

Classification: TP303 [Automation and Computer Technology: Computer System Architecture]
