In theory, branch predictors with more compli- cated algorithms and larger data structures provide more accurate predictions. Unfortunately, overly large structures and excessively complicated algorithms cannot be imp...In theory, branch predictors with more compli- cated algorithms and larger data structures provide more accurate predictions. Unfortunately, overly large structures and excessively complicated algorithms cannot be implemented because of their long access delay. To date, many strategies have been proposed to balance delay with accuracy, but none has completely solved the issue. The architecture for ahead branch prediction (A2BP) separates traditional pre- dictors into two parts. First is a small table located at the front-end of the pipeline, which makes the prediction brief enough even for some aggressive processors. Second, oper- ations on complicated algorithms and large data structures for accurate predictions are all moved to the back-end of the pipeline. An effective mechanism is introduced for ahead branch prediction in the back-end and small table update in the front. To substantially improve prediction accuracy, an indirect branch prediction algorithm based on branch history and target path (BHTP) is implemented in AZBE Experiments with the standard performance evaluation corpora- tion (SPEC) benchmarks on gem5/SimpleScalar simulators demonstrate that AzBP improves average performance by 2.92% compared with a commonly used branch target bufferbased predictor. In addition, indirect branch misses with the BHTP algorithm are reduced by an average of 28.98% com- pared with the traditional algorithm.展开更多
The Godson project is the first attempt to design high performancegeneral-purpose microprocessors in China. This paper introduces the microarchitecture of theGodson-2 processor which is a 64-bit, 4-issue, out-of-order...The Godson project is the first attempt to design high performancegeneral-purpose microprocessors in China. This paper introduces the microarchitecture of theGodson-2 processor which is a 64-bit, 4-issue, out-of-order execution RISC processor that implementsthe 64-bit MIPS-like instruction set. The adoption of the aggressive out-of-order executiontechniques (such as register mapping, branch prediction, and dynamic scheduling) and cachetechniques (such as non-blocking cache, load speculation, dynamic memory disambiguation) helps theGodson-2 processor to achieve high performance even at not so high frequency. The Godson-2 processorhas been physically implemented on a 6-metal 0.18 μm CMOS technology based on the automaticplacing and routing flow with the help of some crafted library cells and macros. The area of thechip is 6,700 micrometers by 6,200 micrometers and the clock cycle at typical corner is 2.3 ns.展开更多
文摘In theory, branch predictors with more compli- cated algorithms and larger data structures provide more accurate predictions. Unfortunately, overly large structures and excessively complicated algorithms cannot be implemented because of their long access delay. To date, many strategies have been proposed to balance delay with accuracy, but none has completely solved the issue. The architecture for ahead branch prediction (A2BP) separates traditional pre- dictors into two parts. First is a small table located at the front-end of the pipeline, which makes the prediction brief enough even for some aggressive processors. Second, oper- ations on complicated algorithms and large data structures for accurate predictions are all moved to the back-end of the pipeline. An effective mechanism is introduced for ahead branch prediction in the back-end and small table update in the front. To substantially improve prediction accuracy, an indirect branch prediction algorithm based on branch history and target path (BHTP) is implemented in AZBE Experiments with the standard performance evaluation corpora- tion (SPEC) benchmarks on gem5/SimpleScalar simulators demonstrate that AzBP improves average performance by 2.92% compared with a commonly used branch target bufferbased predictor. In addition, indirect branch misses with the BHTP algorithm are reduced by an average of 28.98% com- pared with the traditional algorithm.
文摘The Godson project is the first attempt to design high performancegeneral-purpose microprocessors in China. This paper introduces the microarchitecture of theGodson-2 processor which is a 64-bit, 4-issue, out-of-order execution RISC processor that implementsthe 64-bit MIPS-like instruction set. The adoption of the aggressive out-of-order executiontechniques (such as register mapping, branch prediction, and dynamic scheduling) and cachetechniques (such as non-blocking cache, load speculation, dynamic memory disambiguation) helps theGodson-2 processor to achieve high performance even at not so high frequency. The Godson-2 processorhas been physically implemented on a 6-metal 0.18 μm CMOS technology based on the automaticplacing and routing flow with the help of some crafted library cells and macros. The area of thechip is 6,700 micrometers by 6,200 micrometers and the clock cycle at typical corner is 2.3 ns.