期刊文献+
共找到4篇文章
< 1 >
每页显示 20 50 100
RUMINATE METHOD-SOFTWARE PIPELINING ON NESTED LOOPS
1
作者 LEI WANG ZHIZHONGTANG and CHIHONG ZHANG(Dept. of Computer Science, Tsinghua Lirnivcrsitg Beijing 100084,P. R. China)(Final: wl,t ang ,zch@est4. dcs. tsinghua.edu. cn) 《Wuhan University Journal of Natural Sciences》 CAS 1996年第Z1期430-436,共7页
This paper offers a new method to solve the problem of software pipelininsr on nested loops. We first introduce our new software pipelininog method. Ruminate Method, which can optimize program with nested loops. We al... This paper offers a new method to solve the problem of software pipelininsr on nested loops. We first introduce our new software pipelininog method. Ruminate Method, which can optimize program with nested loops. We also outline an algorithm to realize it and introduce the hardware support we designed. The performance of Ruminate Method is analyzed at the end of this paper with the aid of our preliminary experimental result. 展开更多
关键词 Instruction-level Parallelism software pipeline Ruminate Method Nested Loop.
下载PDF
Trace Software Pipelining
2
作者 王剑 AndreasKrall 《Journal of Computer Science & Technology》 SCIE EI CSCD 1995年第6期481-490,共10页
Global software pipelining is a complex but efficient compilation technique to exploit instruction-level parallelism for loops with branches. This paper presents a novel global software pipelining technique, called Th... Global software pipelining is a complex but efficient compilation technique to exploit instruction-level parallelism for loops with branches. This paper presents a novel global software pipelining technique, called Thace Software Pipelining,targeted to the instruction-level parallel processors such as Very Long Instruc-tion Word (VLIW) and superscalar machines. Thace software pipelining applies a global code scheduling technique to compact the original loop body. The re-sulting loop is called a trace software pipelined (TSP) code. The trace softwrae pipelined code can be directly executed with special architectural support or call be transformed into a globally software pipelined loop for the current VLIW and superscalar processors. Thus, exploiting parallelism across all iterations of a loop can be completed through compacting the original loop body with any global code scheduling technique. This makes our new technique very promis-ing in practical compilers. Finally, we also present the preliminary experimental results to support our new approach. 展开更多
关键词 Instruction-level parallelism fine-grain parallelism software pipelining loop scheduling Very Long Instruction Word (VLIW) superscalar processor
原文传递
Optimized Implementation of the FDK Algorithm on One Digital Signal Processor 被引量:1
3
作者 梁文轩 张辉 胡广书 《Tsinghua Science and Technology》 SCIE EI CAS 2010年第1期108-113,共6页
This paper presents an optimized implementation of the FDK algorithm on a single fixed-point TMS320C6455 digital signal processor (DSP). Software pipelining and proper configuration of the data transfer enables a 25... This paper presents an optimized implementation of the FDK algorithm on a single fixed-point TMS320C6455 digital signal processor (DSP). Software pipelining and proper configuration of the data transfer enables a 2563 volume to be reconstructed in about 42 seconds from 360 projections with very good accuracy. This implementation reveals the potential of modern high-performance DSPs in accelerating image reconstruction, especially when cost and power consumption are emphasized. 展开更多
关键词 computed tomography digital signal processor (DSP) high performance computing software pipelining
原文传递
Leakage-Aware Modulo Scheduling for Embedded VLIW Processors
4
作者 关永 薛京灵 《Journal of Computer Science & Technology》 SCIE EI CSCD 2011年第3期405-417,共13页
As semi-conductor technologies move down to the nanometer scale, leakage power has become a significant component of the total power consumption. In this paper, we present a leakage-aware modulo scheduling algorithm t... As semi-conductor technologies move down to the nanometer scale, leakage power has become a significant component of the total power consumption. In this paper, we present a leakage-aware modulo scheduling algorithm to achieve leakage energy saving for applications with loops on Very Long Instruction Word (VLIW) architectures. The proposed algorithm is designed to maximize the idleness of function units integrated with the dual-threshold domino logic, and reduce the number of transitions between the active and sleep modes. We have implemented our technique in the Trimaran compiler and conducted experiments using a set of embedded benchmarks from DSPstone and Mibench on the cycle-accurate VLIW simulator of Trimaran. The results show that our technique achieves significant leakage energy saving compared with a previously published DAG-based (Directed Acyclic Graph) leakage-aware scheduling algorithm. 展开更多
关键词 leakage power very long instruction word (VLIW) software pipelining modulo scheduling
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部