
Parallel program performance optimization for data-intensive applications on multi-core clusters
Abstract Using a dynamic scheduling policy for data task blocks together with an all-locking technique on heterogeneous multi-core clusters, this paper presents a hybrid parallel programming mechanism, at both the multi-process and multi-thread levels, for data-intensive applications. The mechanism dynamically schedules data blocks according to the size of each node's main memory and of the available shared L2 cache. The paper also presents strategies and techniques for optimizing the performance of parallel programs for data-intensive applications. Experiments that solve multi-keyword search over random sequences in parallel on a heterogeneous cluster of multi-core computers show that the proposed parallel programming mechanism and performance optimization methods are feasible and efficient.
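Authors Huang Hualin (黄华林), Zhong Cheng (钟诚)
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2012, Issue 30, pp. 73-77 (5 pages)
Funding Guangxi Higher Education Outstanding Talents Funding Program (No. RC2007004); Guangxi Graduate Education Innovation Program (No. 105931003036)
Keywords multi-core cluster; multithreading; parallel programming; performance optimization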
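
The abstract describes dynamically scheduling cache-sized data blocks to threads for a multi-keyword search over a random sequence. The following is a minimal sketch of that general idea, not the authors' implementation: the names `L2_CACHE_BYTES`, `ITEM_BYTES`, `search_block`, and `parallel_multi_key_search` are hypothetical, the cache size is an assumed value, and a plain mutex on the shared result table stands in for the paper's all-locking technique, whose details the abstract does not give. It shows only the scheduling structure inside one node; the multi-process level across nodes is omitted, and Python threads will not show the speedups reported for native threads.

```python
import queue
import random
import threading

# Assumed values for illustration only; the paper sizes blocks from the
# node's actual main memory and available shared L2 cache.
L2_CACHE_BYTES = 2 * 1024 * 1024
ITEM_BYTES = 8
BLOCK_ITEMS = L2_CACHE_BYTES // (2 * ITEM_BYTES)  # block fits in half the cache


def search_block(data, start, stop, keys):
    """Scan data[start:stop] and return {key: [positions]} for keys found."""
    hits = {}
    for i in range(start, stop):
        value = data[i]
        if value in keys:
            hits.setdefault(value, []).append(i)
    return hits


def parallel_multi_key_search(data, keys, num_threads=4, block_items=BLOCK_ITEMS):
    """Find all positions of each key, handing out blocks to threads on demand."""
    keys = frozenset(keys)

    # Shared task queue: each task is one data block, given as (start, stop).
    tasks = queue.Queue()
    for start in range(0, len(data), block_items):
        tasks.put((start, min(start + block_items, len(data))))

    results = {k: [] for k in keys}
    results_lock = threading.Lock()  # simple mutex, not the paper's technique

    def worker():
        # Dynamic scheduling: a thread takes the next block as soon as it is
        # free, so faster threads naturally process more blocks.
        while True:
            try:
                start, stop = tasks.get_nowait()
            except queue.Empty:
                return
            local_hits = search_block(data, start, stop, keys)
            with results_lock:
                for key, positions in local_hits.items():
                    results[key].extend(positions)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Blocks finish in arbitrary order, so sort positions before returning.
    for positions in results.values():
        positions.sort()
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    sequence = [rng.randrange(1000) for _ in range(200_000)]
    found = parallel_multi_key_search(sequence, [7, 42, 999], block_items=4096)
    for key in sorted(found):
        print(key, len(found[key]))
```

Pulling blocks from a shared queue, rather than assigning a fixed range to each thread, is what lets the work balance itself when cores differ in speed, which is the situation a heterogeneous cluster creates.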
