摘要
This paper presents an instruction scheduling and cluster assignment approach for clustered very long instruction words (VLIW) processors. The technique produces high performance code by simultaneously balancing instructions among clusters and minimizing the amount of inter-cluster data communications. The scheme is evaluated based on benchmarks extracted from UTDSP. Results show a significant speedup compared with previously used techniques with speed-ups of up to 44%, with average speed-ups ranging from 14% (2-cluster) to 18% (4-cluster).
This paper presents an instruction scheduling and cluster assignment approach for clustered very long instruction words (VLIW) processors. The technique produces high performance code by simultaneously balancing instructions among clusters and minimizing the amount of inter-cluster data communications. The scheme is evaluated based on benchmarks extracted from UTDSP. Results show a significant speedup compared with previously used techniques with speed-ups of up to 44%, with average speed-ups ranging from 14% (2-cluster) to 18% (4-cluster).
基金
Supported by the National Natural Science Foundation of China(No. 60236030)
the National Research Foundation for the Doctoral Program of Higher Education of China (No. 20050003083)
the Tsinghua Basic Research Foundation