
Adaptive Scheduling on Performance Asymmetric Multicore Processors (cited by: 1)
Abstract: Existing scheduling algorithms for performance-asymmetric multicore systems either fail to exploit the architecture fully and so deliver low throughput, or exploit it fully but scale poorly. Even the algorithms that consider scalability address only the number of CPU cores and ignore scalability with respect to the number of tasks. To address these problems, the authors propose an adaptive scheduling algorithm called AS4AMS. Each time a task is scheduled, AS4AMS first derives the task's computing requirements by analyzing its average stall time at run time, and then assigns the task to a suitable core according to those requirements and the load on each core. The algorithm repeats this procedure until the task finishes, so that it adapts to changes in the task's requirements (phase changes). Experimental results show that, compared with existing methods, the proposed method scales better and achieves higher throughput.
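The abstract gives only the outline of AS4AMS, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes that each task's stall time is available as a fraction of its run time, that stall time does not shrink on a faster core, and that a core's load can be approximated by its task count. The names `Core`, `Task`, `estimated_cost`, and `reschedule` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    name: str
    speed: float                      # relative compute speed (assumption: 1.0 = slow core)
    tasks: list = field(default_factory=list)


@dataclass
class Task:
    name: str
    stall_fraction: float             # share of run time spent stalled in the last interval


def estimated_cost(task: Task, core: Core) -> float:
    """Assumed cost model: only the compute share speeds up on a fast core,
    and the core's time is shared evenly among its tasks."""
    compute = 1.0 - task.stall_fraction
    time_alone = compute / core.speed + task.stall_fraction
    return time_alone * (len(core.tasks) + 1)


def reschedule(tasks: list, cores: list) -> dict:
    """Re-place every task from scratch; call again whenever stall data changes."""
    for core in cores:
        core.tasks.clear()
    # Place the most compute-bound tasks (lowest stall fraction) first.
    for task in sorted(tasks, key=lambda t: t.stall_fraction):
        best = min(cores, key=lambda c: estimated_cost(task, c))
        best.tasks.append(task.name)
    return {core.name: list(core.tasks) for core in cores}


cores = [Core("fast", speed=2.0), Core("slow", speed=1.0)]
tasks = [Task("cpu_bound", 0.1), Task("mem_bound", 0.9)]
first_placement = reschedule(tasks, cores)

# Simulate a phase change: the two tasks swap behavior, then reschedule.
tasks[0].stall_fraction, tasks[1].stall_fraction = 0.9, 0.1
second_placement = reschedule(tasks, cores)
```

In this sketch the compute-bound task lands on the fast core, and the placement flips once the measured stall fractions swap, which mirrors the repeated re-evaluation the abstract describes. A real scheduler would migrate tasks incrementally instead of re-placing all of them on every pass.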
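Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 4, pp. 773-781 (9 pages)
Funding: Supported by the National Basic Research Program of China (973 Program) (2010CB328102) and the National Natural Science Foundation of China (61133001, 61003078, 61272117, 61272118, 61202038)
Keywords: multicore processor; performance asymmetric; operating systems; scheduling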

