Abstract: This paper offers a new method for software pipelining of nested loops. We first introduce our new software pipelining method, the Ruminate Method, which can optimize programs with nested loops. We also outline an algorithm to realize it and introduce the hardware support we designed. The performance of the Ruminate Method is analyzed at the end of the paper with the aid of our preliminary experimental results.
Abstract: Global software pipelining is a complex but efficient compilation technique for exploiting instruction-level parallelism in loops with branches. This paper presents a novel global software pipelining technique, called Trace Software Pipelining, targeted at instruction-level parallel processors such as Very Long Instruction Word (VLIW) and superscalar machines. Trace software pipelining applies a global code scheduling technique to compact the original loop body. The resulting loop is called trace software pipelined (TSP) code. TSP code can be executed directly with special architectural support, or can be transformed into a globally software pipelined loop for current VLIW and superscalar processors. Thus, parallelism across all iterations of a loop can be exploited by compacting the original loop body with any global code scheduling technique. This makes the new technique very promising for practical compilers. Finally, we present preliminary experimental results that support the approach.
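For readers unfamiliar with the underlying idea, the following is a minimal sketch of classic software pipelining on a branch-free loop. It is not the trace software pipelining technique from the abstract, which additionally handles branches; the function names and the two-stage loop body are hypothetical choices made for illustration. Python executes the statements one after another, so the code only shows the reordered schedule (prologue, kernel, epilogue) that a VLIW or superscalar machine could issue in parallel.

```python
def sequential(a):
    """Original loop: each iteration runs stage 1, then stage 2."""
    out = []
    for x in a:
        t = x * 2          # stage 1 (e.g. load and multiply)
        out.append(t + 1)  # stage 2 (e.g. add and store)
    return out


def software_pipelined(a):
    """Same computation, with stage 1 of iteration i+1 overlapped
    with stage 2 of iteration i (illustrative schedule only)."""
    n = len(a)
    out = []
    if n == 0:
        return out
    # Prologue: start iteration 0.
    t = a[0] * 2
    # Kernel: the two statements belong to different iterations and do not
    # depend on each other, so hardware could issue them in the same cycle.
    for i in range(n - 1):
        t_next = a[i + 1] * 2  # stage 1 of iteration i + 1
        out.append(t + 1)      # stage 2 of iteration i
        t = t_next
    # Epilogue: finish the last iteration.
    out.append(t + 1)
    return out
```

Both functions return the same list for any input; only the order in which the stages of neighbouring iterations are arranged differs.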
Abstract: This paper presents a new microarchitecture technique named DYNAMEM, in which memory reference instructions are dynamically scheduled and can be executed out of order. Load instructions can bypass store instructions speculatively, even if the store instructions' addresses are unknown. DYNAMEM can greatly alleviate the restrictions imposed by ambiguous memory dependencies. Simulation results show that the frequency of false loads is low. A mechanism is provided to repair false loads with low penalty and to achieve precise interrupts. Discussion and experimental results show that DYNAMEM can dramatically raise instruction-level parallelism in programs without recompilation.
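The idea of a load speculatively bypassing a store whose address is still unknown can be illustrated with a toy model. This is a sketch under stated assumptions, not DYNAMEM's actual hardware mechanism: the class `ToyLoadStoreUnit`, its methods, and the repair-by-overwriting policy are all hypothetical, and stores are assumed to resolve in program order.

```python
from dataclasses import dataclass


@dataclass
class SpecLoad:
    seq: int    # program-order position of the load
    addr: int   # address the load read from
    value: int  # value obtained speculatively


class ToyLoadStoreUnit:
    """Hypothetical model: loads run ahead of unresolved older stores."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.spec_loads = []  # loads executed before older stores resolved
        self.false_loads = 0  # count of loads that read stale data

    def speculative_load(self, seq, addr):
        # Execute the load immediately, ignoring unresolved older stores.
        value = self.memory.get(addr, 0)
        self.spec_loads.append(SpecLoad(seq, addr, value))
        return value

    def resolve_store(self, seq, addr, value):
        # The store address is now known: write memory, then check every
        # younger load that already executed against this address.
        self.memory[addr] = value
        repaired = []
        for load in self.spec_loads:
            if load.seq > seq and load.addr == addr:
                load.value = value  # toy "repair": give the load the new value
                self.false_loads += 1
                repaired.append(load.seq)
        return repaired
```

In a short run, a load at position 2 reads address 100 before the store at position 1 is resolved. When that store turns out to write address 100, the load is flagged as a false load and repaired, while an unrelated load to address 200 is left alone.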