
Unified UDispatch: A User Dispatching Tool for Multicore Systems

Abstract: In a multicore environment, multithreading is often used to improve application performance. However, even in many simple applications, performance may degrade as the number of threads increases. Users usually attribute this phenomenon to the overhead of creating or terminating threads. In our observation, how the threads are dispatched to the multiple cores may have a more significant effect. We formally defined the problems that arise when using threads as multithreading anomalies, and presented a novel user dispatching mechanism (UDispatch) that provides controllability in user space to improve application performance. By modifying the application source code to use the UDispatch application programming interface (API), application performance can be improved significantly. However, since the source code might not be available, or might be too complicated to modify, we provided an extension, called UDispatch+, that dispatches threads without any modification of the application source code. In this paper, UDispatch and UDispatch+ are integrated and wrapped for greater portability and introduced as a tool called Unified UDispatch (UUD), with more detailed experiments and description. It can dispatch application threads to specific cores at the discretion of users, with up to 171.8% performance improvement on a 4-core machine.
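To make the idea of user-space dispatching concrete, the following minimal sketch pins each thread of a program to a core chosen by the user. It is not the UDispatch or UUD API, which this record does not describe; it assumes a Linux system and uses Python's standard `os.sched_setaffinity` call instead. The helper names `pin_current_thread` and `run_pinned` are hypothetical and exist only for illustration.

```python
import os
import threading


def pin_current_thread(core):
    """Restrict the calling thread to one core (assumes Linux)."""
    # With pid 0, the underlying sched_setaffinity(2) call applies to the
    # calling thread only, so other threads keep their own CPU masks.
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


def run_pinned(assignments, work):
    """Run work(i) in thread i, pinned to core assignments[i].

    Returns {i: (cpu_mask_seen_by_thread, work_result)}.
    Hypothetical helper: a stand-in for a user-chosen dispatching policy.
    """
    results = {}

    def body(index, core):
        mask = pin_current_thread(core)
        results[index] = (mask, work(index))

    threads = [
        threading.Thread(target=body, args=(i, core))
        for i, core in enumerate(assignments)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Only cores this process is allowed to use are valid targets.
    allowed = sorted(os.sched_getaffinity(0))
    # Example policy: spread four threads round-robin over the allowed cores.
    plan = [allowed[i % len(allowed)] for i in range(4)]
    for index, (mask, value) in sorted(run_pinned(plan, lambda i: i * i).items()):
        print(f"thread {index}: cores={sorted(mask)} result={value}")
```

The mapping from threads to cores is an explicit list here, so the policy stays in the user's hands rather than being left to the kernel scheduler, which is the kind of controllability the abstract refers to. Whether a given mapping helps or hurts depends on the workload and has to be measured.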
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI, and CSCD; 2011, No. 3, pp. 375-391 (17 pages)
Funding: Supported in part by the "National Science Council", Taiwan, China, under Grant Nos. NSC-99-2628-E-002-027 and NSC-99-2219-E-002-029, and by the Excellent Research Projects of "National Taiwan University" under Grant No. 99R80300.
Keywords: multithreading, multicore, scheduling, dispatching, anomaly

