[1] JANG B, SCHAA D, MISTRY P, et al. Exploiting memory access patterns to improve memory performance in data-parallel architectures[J]. IEEE Transactions on Parallel and Distributed Systems, 2011, 22(1): 105-118.
[2] TZENG S, PATNEY A, OWENS J D. Task management for irregular-parallel workloads on the GPU[C]. Proceedings of the Conference on High Performance Graphics, 2010: 29-37.
[3] YANG Y, XIANG P, KONG J, et al. A GPGPU compiler for memory optimization and parallelism management[J]. ACM SIGPLAN Notices, 2010, 45(6): 86-97.
[4] YANG Y, XIANG P, KONG J, et al. A unified optimizing compiler framework for different GPGPU architectures[J]. ACM Transactions on Architecture and Code Optimization, 2012, 9(2): 1-33.