Adaptive Scheduling Algorithm Based on Dynamic Core-Resource Partitions for Many-Core Processor Systems
Cited by: 14
Abstract: To make better use of the large number of cores in many-core processor systems, this paper proposes CASM (core-partitioned adaptive scheduling for many-core systems), an adaptive scheduling algorithm that supports dynamic partitioning of core resources. By splitting and merging task clusters, CASM dynamically builds elastically resizable logical groups of cores, so that each task cluster accesses its own core partition in isolation. To balance core utilization against scheduling efficiency, CASM schedules at two levels: a fair equi-partitioning algorithm reallocates cores among task clusters, and a feedback-driven adaptive algorithm with higher resource utilization schedules tasks within each cluster. Online competitive analysis shows that CASM is 2-competitive with respect to the execution time of parallel jobs, a constant ratio that indicates good performance scalability. Experimental results show that, compared with WS (work-stealing), AGDEQ (adaptive greedy dynamic equi-partitioning), and EQUI∘EQUI, CASM reduces the execution time of the same workload by nearly 46%, 32%, and 15%, respectively. Under the same energy consumption, CASM substantially increases system throughput.
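The two scheduling levels named in the abstract can be illustrated with a minimal sketch. It is not the CASM implementation, which the abstract does not specify. It assumes a standard dynamic equi-partitioning rule for sharing cores among task clusters and an A-Greedy-style multiplicative feedback rule for each cluster's core request, the combination behind the AGDEQ baseline. The function names `deq` and `next_desire` and the parameters `rho` (responsiveness) and `delta` (utilization threshold) are hypothetical choices made for illustration.

```python
def deq(total_cores, desires):
    """Dynamic equi-partitioning (illustrative, not the paper's code).

    desires maps a cluster name to the number of cores it requests.
    No cluster receives more than it requests; clusters that request more
    than the fair share split the remaining cores as equally as possible.
    """
    alloc = {name: 0 for name in desires}
    remaining = dict(desires)
    free = total_cores
    while remaining and free > 0:
        share = free // len(remaining)
        satisfied = [n for n, d in remaining.items() if d <= share]
        if not satisfied:
            # Every remaining cluster wants more than the fair share:
            # split equally, handing leftover cores out in a fixed order.
            extra = free - share * len(remaining)
            for i, n in enumerate(sorted(remaining)):
                alloc[n] = share + (1 if i < extra else 0)
            break
        for n in satisfied:
            alloc[n] = remaining.pop(n)
            free -= alloc[n]
    return alloc


def next_desire(desire, allotted, used_fraction, rho=2, delta=0.8):
    """A-Greedy-style feedback on a cluster's core request (assumed rule).

    used_fraction is the share of allotted core time the cluster actually
    used in the last scheduling quantum.
    """
    if used_fraction < delta:
        # Inefficient quantum: request fewer cores next time.
        return max(1, desire // rho)
    if allotted >= desire:
        # Efficient and fully satisfied: request more cores.
        return desire * rho
    # Efficient but deprived of cores: keep the same request.
    return desire


if __name__ == "__main__":
    print(deq(16, {"A": 2, "B": 10, "C": 10}))  # {'A': 2, 'B': 7, 'C': 7}
    print(next_desire(4, 4, 0.9))               # 8
```

In this sketch, cluster A asks for less than its fair share and is fully served, while B and C split the 14 cores that remain. Each cluster would then call `next_desire` at the end of a quantum to adjust its request before the next round of partitioning.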
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the PKU Core Journals list; 2012, No. 2, pp. 240-252 (13 pages)
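Funding: National Natural Science Foundation of China (61073011, 61133004, 61173039); National High Technology Research and Development Program of China (863 Program) (2008AA01A202, 2009AA01A131); Sino-Italian International Cooperation Project (2009DFA12110)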
Keywords: many-core processor; cluster-based scheduling; adaptive scheduling; competitive analysis; power-efficient computing