Journal articles: 2 found
1. Understanding co-run performance on CPU-GPU integrated processors: observations, insights, directions (Cited: 1)
Authors: Qi ZHU, Bo WU, Xipeng SHEN, Kai SHEN, Li SHEN, Zhiying WANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2017, No. 1, pp. 130-146 (17 pages).
Recent years have witnessed a processor development trend that integrates central processing unit (CPU) and graphics processing unit (GPU) into a single chip. The integration helps to save some host-device data copying that a discrete GPU usually requires, but also introduces deep resource sharing and possible interference between CPU and GPU. This work investigates the performance implications of independently co-running CPU and GPU programs on these platforms. First, we perform a comprehensive measurement that covers a wide variety of factors, including processor architectures, operating systems, benchmarks, timing mechanisms, inputs, and power management schemes. These measurements reveal a number of surprising observations. We analyze these observations and produce a list of novel insights, including the important roles of operating system (OS) context switching and power management in determining program performance, and the subtle effect of CPU-GPU data copying. Finally, we confirm those insights through case studies, and point out some promising directions to mitigate anomalous performance degradation on integrated heterogeneous processors.
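The co-run degradation the abstract describes can be illustrated with a minimal, hypothetical measurement harness (not from the paper): time a workload running alone, then time it again while a competing process runs, and report the slowdown ratio. Here a second CPU process stands in for the independently co-running program; the function names are illustrative only.

```python
import time
from multiprocessing import Process

def cpu_workload(n=2_000_000):
    # Stand-in for a benchmark: a tight arithmetic loop.
    s = 0
    for i in range(n):
        s += i * i
    return s

def timed_solo():
    # Baseline: workload runs with no competitor.
    t0 = time.perf_counter()
    cpu_workload()
    return time.perf_counter() - t0

def timed_corun():
    # Co-run: a rival process (stand-in for a co-running GPU or CPU
    # program) competes for shared resources during the measurement.
    rival = Process(target=cpu_workload, args=(4_000_000,))
    rival.start()
    t0 = time.perf_counter()
    cpu_workload()
    elapsed = time.perf_counter() - t0
    rival.join()
    return elapsed

if __name__ == "__main__":
    solo = timed_solo()
    corun = timed_corun()
    print(f"co-run degradation: {corun / solo:.2f}x")
```

On an integrated processor, the shared resources would also include memory bandwidth, last-level cache, and the chip power budget, which is why the measured ratio can vary with the power management scheme in use.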
Keywords: performance analysis; GPGPU; co-run degradation; fused processor; program transformation
2. Resolving the GPU responsiveness dilemma through program transformations
Authors: Qi ZHU, Bo WU, Xipeng SHEN, Kai SHEN, Li SHEN, Zhiying WANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2018, No. 3, pp. 545-559 (15 pages).
The emerging integrated CPU-GPU architectures facilitate short computational kernels to utilize GPU acceleration. Evidence has shown that, on such systems, the GPU control responsiveness (how soon the host program finds out about the completion of a GPU kernel) is essential for overall performance. This study identifies the GPU responsiveness dilemma: host busy polling responds quickly, but at the expense of high energy consumption and interference with co-running CPU programs; interrupt-based notification minimizes energy and CPU interference costs, but suffers from substantial response delay. We present a program-level solution that wakes up the host program in anticipation of GPU kernel completion. We systematically explore the design space of an anticipatory wakeup scheme through a timer-delayed wakeup or kernel splitting-based pre-completion notification. Experiments show that our proposed technique can achieve the best of both worlds, high responsiveness with low power and CPU costs, for a wide range of GPU workloads.
Keywords: program transformation; GPU; integrated architecture; responsiveness