The Data-Flow Block Based Spatial Instruction Scheduling Method (基于数据流块的空间指令调度方法)

Cited by: 5

Abstract: Clustered superscalar processors partition hardware resources to avoid the energy and cycle-time penalties incurred by large, monolithic structures. Dynamic multi-core processors fuse the hardware resources of several physical cores to provide computing capability that adapts to application demands. These architectures achieve energy-efficient computation through carefully orchestrated use of spatially distributed hardware resources. In such spatially partitioned architectures, problems such as instruction load imbalance and inter-partition operand forwarding latency may cause performance penalties, so an effective instruction scheduling method is needed to distribute computation among the partitions. We present the data-flow block (DFB) based spatial instruction scheduling method. DFBs are dynamically constructed, cached, and reused schedule patterns for one or more sequentially executed instruction basic blocks. The DFB scheduling algorithm models the data-flow constraints of the dynamic instruction stream and the scheduling space defined by the hardware resources, then makes scheduling decisions according to the relative criticality of instructions, quantified as their scheduling slack. We describe the microarchitectural framework and the algorithm of DFB scheduling. Through experiments on microarchitecture parameters closely related to the scheduling method, such as partition count, inter-partition latency, and schedule window capacity, we show that ideal DFB scheduling performs better and more stably than round-robin (load-balancing) scheduling and dependence-based scheduling. Finally, we show by example that DFB scheduling combined with a DFB cache implementation achieves scheduling performance close to that of ideal DFB scheduling.
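The idea of scheduling by relative criticality can be illustrated with a small sketch. The code below is not the DFB algorithm from the paper; it is a generic slack-driven list scheduler written under stated assumptions: instructions are given in program order, each partition issues at most one instruction per cycle, and an operand crossing partitions pays a fixed extra delay. The names `compute_slack`, `schedule`, `partitions`, and `hop_latency` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def compute_slack(latency, deps):
    """Return {instr: slack}, where slack = latest start - earliest start.

    latency: {instr: cycles}, keyed in program order (assumed topological).
    deps:    {instr: [producer instrs]}; missing keys mean no producers.
    """
    order = list(latency)
    asap = {}
    for i in order:
        asap[i] = max((asap[p] + latency[p] for p in deps.get(i, ())), default=0)
    length = max(asap[i] + latency[i] for i in order)  # critical path length

    users = defaultdict(list)
    for i, preds in deps.items():
        for p in preds:
            users[p].append(i)

    alap = {}
    for i in reversed(order):
        alap[i] = min((alap[u] for u in users[i]), default=length) - latency[i]
    return {i: alap[i] - asap[i] for i in order}


def schedule(latency, deps, partitions=2, hop_latency=1):
    """Greedy placement: most critical ready instruction first, placed on
    the partition where it can start earliest. Returns {instr: (partition, cycle)}."""
    slack = compute_slack(latency, deps)
    placed = {}
    busy = [set() for _ in range(partitions)]  # issue cycles already taken
    remaining = list(latency)
    while remaining:
        ready = [i for i in remaining
                 if all(p in placed for p in deps.get(i, ()))]
        instr = min(ready, key=lambda i: slack[i])  # ties keep program order
        best = None
        for part in range(partitions):
            start = 0
            for p in deps.get(instr, ()):
                p_part, p_start = placed[p]
                hop = 0 if p_part == part else hop_latency
                start = max(start, p_start + latency[p] + hop)
            while start in busy[part]:  # one issue slot per cycle (assumption)
                start += 1
            if best is None or start < best[1]:
                best = (part, start)
        placed[instr] = best
        busy[best[0]].add(best[1])
        remaining.remove(instr)
    return placed


if __name__ == "__main__":
    # A three-instruction dependence chain plus one independent instruction.
    lat = {"a": 1, "b": 1, "c": 1, "d": 1}
    dep = {"b": ["a"], "c": ["b"]}
    print(compute_slack(lat, dep))
    print(schedule(lat, dep, partitions=2, hop_latency=2))
```

In this toy setting, instructions on the dependence chain have zero slack and tend to stay in one partition to avoid the forwarding delay, while instructions with slack can be moved to another partition to balance load. This is the trade-off between load imbalance and inter-partition latency that the abstract describes.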

Authors: Liu Bingtao (刘炳涛); Wang Da (王达); Ye Xiaochun (叶笑春); Fan Dongrui (范东睿); Zhang Zhimin (张志敏); Tang Zhimin (唐志敏) (State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049; Institute of Information and Control, Hangzhou Dianzi University, Hangzhou 310018)

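Affiliations: State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences); School of Computer and Control Engineering, University of Chinese Academy of Sciences; Institute of Information and Control, Hangzhou Dianzi University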

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2017, No. 4, pp. 750-763 (14 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

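Funding: National Key Research and Development Program of China (2016YFB0200501); National Natural Science Foundation of China (61332009, 61521092, 61671196, 61327902); Open Fund of the State Key Laboratory of Mathematical Engineering and Advanced Computing (2016A04); Beijing Municipal Science and Technology Commission Science and Technology Program (Z15010101009)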

Keywords: processor microarchitecture; load balancing; instruction scheduling; data-flow; critical path

Chinese Library Classification: TP303 [Automation and Computer Technology: Computer System Architecture]

