Abstract: This paper introduces the characteristics of function instructions, focusing on the contact compare instruction and the transfer instruction. Using traffic light control as an example, it proposes a program design method that applies the contact compare and transfer instructions, which keeps the program structure compact and the statements concise, so the control requirements are easily met. Key words: Function Instruction; Programming; Traffic Lights; Contact Compare Instruction; Transfer
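The comparison-driven approach described in this abstract can be illustrated with a minimal sketch. It is not the paper's PLC program: it is written in Python rather than ladder logic, the timings and the names `GREEN_END`, `YELLOW_END`, `CYCLE`, and `lamps` are hypothetical, and plain constants stand in for presets that a transfer instruction would load into data registers. The idea shown is that one elapsed-time counter, tested by range comparisons, can drive every lamp output without a separate timer per phase.

```python
# Hypothetical cycle timings in seconds; the abstract gives no values.
# On a PLC these presets would be loaded into registers by a transfer instruction.
GREEN_END = 25    # green while 0 <= t < 25
YELLOW_END = 30   # yellow while 25 <= t < 30
CYCLE = 60        # red while 30 <= t < 60, then the counter wraps


def lamps(t: int) -> dict[str, bool]:
    """Return lamp states for one direction from the elapsed-time counter.

    Each output is driven by a range comparison on the same counter,
    mirroring how a contact compare instruction gates an output coil.
    """
    t %= CYCLE  # stands in for resetting the counter at the end of a cycle
    return {
        "green": 0 <= t < GREEN_END,
        "yellow": GREEN_END <= t < YELLOW_END,
        "red": YELLOW_END <= t < CYCLE,
    }


if __name__ == "__main__":
    for second in (0, 24, 25, 29, 30, 59, 60):
        lit = [name for name, is_on in lamps(second).items() if is_on]
        print(second, lit)
```

Because the three ranges partition the cycle, exactly one lamp is lit at any counter value, which is the property that makes the comparison-based structure compact.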
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 601173006 and 61221062, and the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010403.
Abstract: The combination of growing transistor counts and a limited power budget within a silicon die leads to the utilization wall problem (a.k.a. "Dark Silicon"): only a small fraction of a chip can run at full speed at any given time. Designing accelerators for specific applications or algorithms is considered one of the most promising approaches to improving energy efficiency. However, most current accelerator design methods are dedicated to particular applications or algorithms, which greatly constrains their applicability. In this paper, we propose a novel general-purpose many-accelerator architecture. Our contributions are two-fold. First, we propose clustering the dataflow graphs (DFGs) of hotspot basic blocks (BBs) in applications and using the resulting DFG clusters for accelerator design, because a DFG is the largest program unit that is not specific to a particular application. We analyze 17 benchmarks in SPEC CPU 2006, extract over 300 hotspot DFGs using the LLVM compiler tools, and divide them into 15 clusters based on graph similarity. Second, we introduce a function instruction set architecture (FISC) and illustrate how DFG accelerators can be integrated with a processor core and used by applications. Our results show that the proposed DFG clustering and FISC design speed up SPEC benchmarks by 6.2x on average.
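Clustering DFGs by graph similarity, as this abstract describes, might look roughly like the sketch below. The abstract does not say which similarity metric or clustering algorithm the authors use, so everything here is an assumption: each DFG is reduced to a set of (producer opcode, consumer opcode) edges, similarity is the Jaccard index of those sets, and clustering is a simple greedy threshold pass. The names `jaccard`, `cluster`, `EXAMPLE_DFGS`, and the three toy basic blocks are hypothetical.

```python
def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two edge sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(dfgs: dict[str, set], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: each DFG joins the first cluster whose
    representative (its first member) is at least `threshold` similar,
    otherwise it starts a new cluster."""
    clusters: list[tuple[frozenset, list[str]]] = []
    for name, edges in dfgs.items():
        signature = frozenset(edges)
        for representative, members in clusters:
            if jaccard(signature, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((signature, [name]))
    return [members for _, members in clusters]


# Toy DFGs as sets of (producer opcode, consumer opcode) edges; illustrative only.
EXAMPLE_DFGS = {
    "bb_a": {("load", "add"), ("add", "store")},
    "bb_b": {("load", "add"), ("add", "store"), ("load", "mul")},
    "bb_c": {("load", "fmul"), ("fmul", "fadd"), ("fadd", "store")},
}

if __name__ == "__main__":
    print(cluster(EXAMPLE_DFGS))  # [['bb_a', 'bb_b'], ['bb_c']]
```

In this toy input, `bb_a` and `bb_b` share two of three distinct edges (similarity 2/3) and fall into one cluster, while `bb_c` shares none and forms its own. A real flow would extract the graphs from compiler IR and would likely use a structure-aware similarity measure in place of this edge-set approximation.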