AN OBJECT ORIENTED C++ PARALLEL COMPILER SYSTEM
1
Authors: XiaoNong Shouren HU (Department of Computer Science, National University of Defense Technology, Changsha, HuNan, P.R. China 410073). Wuhan University Journal of Natural Sciences (CAS), 1996, No. Z1, pp. 437-441 (5 pages)
An object-oriented C++ parallel compiler system, called OOCPCS, has been developed to help programmers write sequential programs in C++ or Annotated C++ for parallel computation. OOCPCS is based on an integrated object-oriented paradigm and large-grain data flow model, called OOLGDFM, and automatically recognizes parallel objects using parallel compiling techniques. The paper describes the object-oriented parallel model and the realization of the system on networks.
Keywords: object-oriented; parallel system; compiler
Register Allocation Compilation Technique for ASIP in 5G Micro Base Stations
2
Authors: Wei Chen, Dake Liu, Shaohan Liu. China Communications (SCIE, CSCD), 2022, No. 8, pp. 115-126 (12 pages)
The currently available compilation techniques target general-purpose computing and are not optimized for physical-layer computing in 5G micro base stations. In such cases, the foreseeable data sizes and small code size are application-specific opportunities for baseband algorithm optimization. These opportunities deserve special attention; for example, a dedicated register allocation algorithm has not been studied so far. Our focus is the compilation of baseband kernel subroutines in 5G micro base stations. For applications with known and fixed data sizes, we propose a compilation scheme with parallel data access in which operands are mainly allocated and stored in registers. Based on a small register group (48×32b), our compilation scheme aims to optimize baseband algorithms that operate on 4×4 or smaller matrices, maximize the utilization of the register files, and eliminate extra register data exchange. Meanwhile, once data is allocated into the register files, we use a VLIW (Very Long Instruction Word) machine to hide data access time and minimize the cost of data access, thereby minimizing the total execution time. Experiments indicate that for algorithms with small data sizes, the cost of data access and extra addressing can be minimized.
Keywords: parallel data access; compilation; small size matrix; 5G micro base stations; register allocation algorithm
Optimizing FORTRAN Programs for Hierarchical Memory Parallel Processing Systems
3
Authors: Jin Guohua (金国华), Chen Fujie (陈福接). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1993, No. 3, pp. 209-220 (12 pages)
Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems with caches or local memories in their memory hierarchies, the "thrashing problem" may arise whenever data move back and forth between the caches or local memories of different processors. Previous techniques can only deal with the rather simple cases with one linear function in a perfectly nested loop. In this paper, we present a parallel program optimizing technique called hybrid loop interchange (HLI) for cases with multiple linear functions and loop-carried data dependences in the nested loop. With HLI we can easily eliminate or reduce the thrashing phenomena without reducing program parallelism.
Keywords: thrashing problem; hierarchical memory; cache; parallelizing compiler; hybrid loop interchange; FORTRAN
A High Speed Signal Processing Machine - Its Architecture, Language and Compiler
4
Authors: Wang Yufei and Yu Shiqi (Beijing Institute of Data Processing Technology, P.O. Box 3927, Beijing 100039, China). Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 1991, No. 1, pp. 119-128 (10 pages)
A systolic array architecture computer (FXCQ) has been designed for signal processing. It can handle floating-point data at very high speed. It is composed of 16 processing cells and a cache that are connected linearly and form a ring structure. All processing cells are identical and programmable. Each processing cell has a peak performance of 20 million floating-point operations per second (20 MFLOPS), so the machine has a peak performance of 320 MFLOPS. It is integrated as an attached processor into a host system through a VME bus interface. Programs for FXCQ are written in a high-level language, the B language, which is supported by a parallel optimizing compiler. This paper describes the architecture of FXCQ, the B language and its compiler.
Keywords: parallel processing; systolic array processor; parallel language; compiler
Shared Variable Oriented Parallel Precompiler for SPMD Model
5
Authors: Kang Jichang (康继昌), Zhu Yi'an (朱怡安), Hong Yuanlin (洪远麟), Ying Bishan (应必善). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1995, No. 5, pp. 476-480 (5 pages)
At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers extended with communication statements. Writing parallel programs with explicit communication statements is a burden on programmers. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model, which greatly eases parallel programming while keeping communication efficiency high. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.
Keywords: parallel computer; parallel compiler; parallel precompiler; communication statement
NUAPC: A Parallelizing Compiler for C++
6
Authors: Zhu Genjiang (朱根江), Xie Li (谢立), Sun Zhongxiu (孙钟秀). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1997, No. 5, pp. 458-459 (2 pages)
This paper presents a model for an automatic parallelizing compiler based on C++, which consists of compile-time and run-time parallelizing facilities. The paper also describes a method for finding both intra-object and inter-object parallelism. The parallelism detection is completely transparent to users.
Keywords: parallelizing compiler; data dependence; object-oriented programming; distributed discrete-event simulation
Exploiting Loop Parallelism with Redundant Execution
7
Authors: Tang Weiyu (唐卫宇), Shi Wu (施武), Zang Binyu (臧斌宇), Zhu Chuanqi (朱传琪). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1997, No. 2, pp. 105-112 (8 pages)
In this paper, a new loop transformation is proposed that can exploit parallelism in loops which cannot be found by traditional methods. The method is then extended to show how to achieve the maximum speedup of loops when infinitely many processors are available, and how to balance the workload of parallel sections in loops when there is a fixed number of processors.
Keywords: parallel compiler; loop transformation; parallel section; maximum speedup
Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors
8
Authors: Shen Meiming (沈美明), Tian Xinmin (田新民), Wang Dingxing (王鼎兴), Zheng Weimin (郑纬民), Wen Dongchan (温冬婵). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1993, No. 3, pp. 233-242 (10 pages)
In this paper, we focus on compiling the parallel logic language PARLOG and the functional language ML for distributed memory multiprocessors. Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model (HPGREM) is first presented. Then, based on HPGREM, a parallel abstract machine, PAM/TGR, is described. Furthermore, several optimizing compilation schemes for executing declarative programs on a transputer array are proposed. The performance statistics on a transputer array demonstrate the effectiveness of our model, parallel abstract machine, optimizing compilation strategies and compiler.
Keywords: declarative language; parallel graph rewriting execution model; optimized parallel compiler; distributed memory multiprocessors; parallel abstract machine
Task scheduling of parallel programs to optimize communications for cluster of SMPs
9
Authors: Zheng Weimin (郑纬民), Yang Bo (杨博), Lin Weijian (林伟坚), Li Zhiguang (李志光). Science in China (Series F), 2001, No. 3, pp. 213-225 (13 pages)
This paper discusses compile-time task scheduling of parallel programs running on a cluster of SMP workstations. First, the problem is stated formally, transformed into a graph partition problem and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve the problem. Experimental results show that the task scheduling can greatly reduce the communication overhead of parallel applications and that MMP-Solver outperforms the existing algorithms.
Keywords: SMP; cluster of workstations; communication optimization; task scheduling; graph partition; parallelizing compiler
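The graph-partition view mentioned in this abstract can be illustrated with a small sketch. It is not MMP-Solver, whose details the abstract does not give. It only evaluates the objective such a partitioner would try to reduce, under the assumption that communication between tasks on the same SMP node is free and communication between nodes costs the edge weight. The task graph, the weights and the names `communication_cost`, `clustered` and `interleaved` are hypothetical.

```python
# Illustrative only: a tiny task graph and two placements on a 2-node SMP cluster.
# Assumption: messages between tasks on the same SMP node cost nothing,
# while messages between different nodes cost the edge weight.

def communication_cost(edges, node_of):
    """Sum the weights of edges whose two tasks sit on different SMP nodes."""
    return sum(weight for u, v, weight in edges if node_of[u] != node_of[v])

# (task, task, communication volume); all values are made up for the example.
EDGES = [("a", "b", 5), ("b", "c", 1), ("c", "d", 5), ("a", "d", 1)]

clustered = {"a": 0, "b": 0, "c": 1, "d": 1}    # heavy edges stay inside a node
interleaved = {"a": 0, "b": 1, "c": 0, "d": 1}  # heavy edges cross nodes

print(communication_cost(EDGES, clustered))    # 2
print(communication_cost(EDGES, interleaved))  # 12
```

Both placements put two tasks on each node, so the load is equally balanced; only the cut weight differs, which is the quantity a compile-time scheduler of this kind would aim to keep small.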
Loop Staggering, Loop Compacting: Restructuring Techniques for Thrashing Problem
10
Authors: Jin Guohua (金国华), Yang Xuejun (杨学军), Chen Fujie (陈福接). Journal of Computer Science & Technology (SCIE, EI, CSCD), 1993, No. 1, pp. 49-56 (8 pages)
Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems with caches or local memories in their memory hierarchies, the "thrashing problem" may arise when data move back and forth frequently between the caches or local memories of different processors. The parallel compiler techniques for solving this problem are not yet fully developed. In this paper, we present two restructuring techniques, called "loop staggering" and "loop staggering and compacting", with which we can not only significantly reduce cache or local memory thrashing, but also exploit the potential parallelism in the outer serial loop. Loop staggering benefits dynamic loop scheduling strategies, whereas loop staggering and compacting suits static loop scheduling strategies. Our method especially benefits parallel programs in which a parallel loop is enclosed by a serial loop and array elements are repeatedly used in different iterations of the parallel loop.
Keywords: parallelizing compiler; loop staggering; loop compacting; thrashing problem; cache
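The thrashing scenario described in this abstract, a parallel loop enclosed by a serial loop with array elements reused across iterations, can be made concrete with a small simulation. This is a minimal sketch and not the authors' loop staggering or compacting transformation. The access pattern (iteration `(i, j)` touches element `i + j`), the two iteration-to-processor mappings and all function names are assumptions chosen for illustration.

```python
def count_migrations(n_outer, n_inner, n_procs, assign):
    """Count how often an array element is touched by a different processor
    than the one that touched it last (a proxy for cache thrashing)."""
    last_owner = {}
    migrations = 0
    for i in range(n_outer):            # serial outer loop
        for j in range(n_inner):        # parallel inner loop (simulated serially)
            element = i + j             # hypothetical access pattern a[i + j]
            proc = assign(i, j, n_procs)
            previous = last_owner.get(element)
            if previous is not None and previous != proc:
                migrations += 1
            last_owner[element] = proc
    return migrations

def by_iteration(i, j, n_procs):
    """Map inner iteration j to the same processor in every outer iteration."""
    return j % n_procs

def by_element(i, j, n_procs):
    """Shift the mapping with i so each element stays on one processor."""
    return (i + j) % n_procs

print(count_migrations(4, 8, 4, by_iteration))  # 21
print(count_migrations(4, 8, 4, by_element))    # 0
```

With the fixed mapping, every reused element changes processor on each outer iteration; shifting the mapping along with the outer index keeps each element where it was last used, which is the kind of effect the restructuring techniques in the abstract aim for.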