10 articles found
Determining the Number of Processors in a Parallel Simulation System
1
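Authors: 洪波, 陈德来, 翁史烈. 计算机仿真 (Computer Simulation), CSCD, 1998, No. 1, pp. 23-25 (3 pages)
Parallel computation for problems of different sizes and with different requirements should use different numbers of processors to obtain the best parallel processing performance. Taking the parallel simulation of continuous systems as an example, this paper proposes a method for determining the number of processors suited to parallel simulation systems.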
Keywords: number of processors; system simulation; parallel simulation system
Efficient SIMD optimization for media processors
2
Authors: Jian-peng ZHOU, Ce SHI. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2008, No. 4, pp. 524-530 (7 pages)
Single instruction multiple data (SIMD) instructions are often implemented in modern media processors. Although SIMD instructions are useful in multimedia applications, most compilers do not have good support for SIMD instructions. This paper focuses on SIMD instruction generation for media processors. We present an efficient code optimization approach that is integrated into a retargetable C compiler. SIMD instructions are generated by finding and combining the same operations in programs. Experimental results for the UltraSPARC VIS instruction set show that a speedup factor of up to 2.639 is obtained.
Keywords: Retargetable compiler; Single instruction multiple data (SIMD) instruction; LCC
Research and Design of Monolithic Decision Circuit for Optical Communication System
3
Authors: ZHANG Yaqi, ZHAO Jie. Semiconductor Photonics and Technology, CAS, 1997, No. 4, pp. 262-268 (7 pages)
In this paper, the cause of bit errors when data are decided in the optical receiver is analyzed. A monolithic D-ff decision circuit is designed that works effectively at 622 Mb/s. Moreover, a parallel-processing decision method is presented to improve the decision speed, with which the parallel circuit can work at up to 1 Gb/s using the same model. With this technique, higher-speed data can be decided using lower-speed devices.
Keywords: BER; D-ff; Decision Circuit; MULTIPLEXER; Parallel Processing
Optimizing pipeline for a RISC processor with multimedia extension ISA (cited by: 1)
4
Authors: 肖志斌, 刘鹏, 姚英彪, 姚庆栋. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2006, No. 2, pp. 269-274 (6 pages)
The 32-bit extensible embedded processor RISC3200, originating from an RTL prototype core, is intended for low-cost consumer multimedia products. In order to incorporate the reduced instruction set and the multimedia extension instruction set in a unifying pipeline, a scalable super-pipeline technique is adopted. Several other optimization techniques are proposed to boost the frequency and reduce the average CPI of the unifying pipeline. Based on a data flow graph (DFG) with delay information, the critical path of the pipeline stage can be located and shortened. This paper presents a distributed data bypass unit and a centralized pipeline control scheme for achieving lower CPI. Synthesis and simulation showed that the optimization techniques enable RISC3200 to operate at 200 MHz with an average CPI of 1.16. The core was integrated into a media SOC chip taped out in SMIC 0.18-micron technology. Preliminary test results showed that the processor works as expected.
Keywords: PIPELINE; RISC; Single-instruction-multiple-data (SIMD); Instruction set architecture (ISA); Multimedia extension
Performance Behaviour Analysis of the Present 3-Level Cache System for Multi-Core Processors
5
Author: Muhammad Ali Ismail. Computer Technology and Application, 2012, No. 11, pp. 729-733 (5 pages)
In this paper, a study of the expected performance behaviour of the present 3-level cache system for multi-core systems is presented. For this, a queuing model of the present 3-level cache system for multi-core processors is developed, and its possible performance is analyzed as the number of cores increases. Important performance parameters, such as the access time and utilization of the individual caches at different levels and the overall average access time of the cache system, are determined. Results for up to 1024 cores are reported in this paper.
Keywords: MULTI-CORE; memory hierarchy; cache access time; queuing analysis
Implementation of Adaptive Wavelet Thresholding Denoising Algorithm Based on DSP
6
Authors: 张雪峰, 康春霞, 裴峰, 张志杰. Journal of Measurement Science and Instrumentation, CAS, 2011, No. 3, pp. 272-275 (4 pages)
By utilizing the high-speed computing and powerful real-time processing capability of the TMS320F2812 DSP, a wavelet thresholding denoising algorithm is realized on Digital Signal Processors. Based on the multi-resolution analysis of the wavelet transformation, this paper proposes a new thresholding function that overcomes, to some extent, the discontinuity of the hard-thresholding function and the bias of the soft-thresholding function. The threshold value is obtained adaptively according to the characteristics of the wavelet coefficients of each layer by adopting an adaptive threshold algorithm, and the noise is then removed. The simulation results show that the improved thresholding function and the adaptive threshold algorithm denoise effectively and meet the criteria of smoothness and similarity between the original signal and the denoised signal.
Keywords: Mallat algorithm; wavelet denoising; thresholding function; adaptive threshold; Digital Signal Processors
An Adaptive Genetic Algorithm for Multiprocessor Real-time Task Scheduling
7
Authors: 李亚军, 杨宇航. Journal of Donghua University (English Edition), EI CAS, 2009, No. 2, pp. 111-118 (8 pages)
Real-time task scheduling is of primary significance in multiprocessor systems. Meeting deadlines and achieving high system utilization are the two main objectives of task scheduling in such systems. In this paper, we represent those two goals as the minimization of the average response time and the average task laxity. To achieve this, we propose a genetic-based algorithm with problem-specific and efficient genetic operators. Adaptive control parameters are also employed in our work to improve the genetic algorithm's efficiency. The simulation results show that our proposed algorithm outperforms its counterpart considerably, by up to 36% and 35% in terms of the average response time and the average task laxity, respectively.
Keywords: SCHEDULING; genetic algorithm; REAL-TIME; DEADLINE
Towards high-performance packet processing on commodity multi-cores: current issues and future directions (cited by: 5)
8
Authors: TANG Lu, YAN JinLi, SUN ZhiGang, LI Tao, ZHANG MinXuan. Science China Chemistry, SCIE EI CAS CSCD, 2015, No. 12, pp. 24-39 (16 pages)
The demands of programmability have become more and more exigent as novel network services appear, such as e-commerce, social software, and online video. Commodity multi-core CPUs have been widely applied in network packet processing to achieve high programmability and reduce the time-to-market. However, there is a great gap between the packet processing performance of commodity multi-cores and that of traditional packet processing hardware, e.g., NP (Network Processor). Recently, optimization of the packet processing performance of commodity multi-cores has become a hot topic in industry and academia. In this paper, based on a detailed analysis of the packet processing procedure, we first identify two dominating overheads, namely the virtual-to-physical address translation and the packet buffer management. Second, we make a comprehensive survey of the current optimization methods. Third, based on the survey, the heterogeneous architecture of commodity multi-core + FPGA is proposed as a promising way to improve the packet processing performance. Fourth, a novel Self-Described Buffer (SDB) management technology is introduced to eliminate the overheads of the allocation and deallocation of the packet buffers offloaded to FPGA. Then, an evaluation testbed, named PIOT (Packet I/O Testbed), is designed and implemented to evaluate the packet forwarding performance. The I/O capacity of different commodity multi-core CPUs and the performance of the optimization methods are assessed and compared based on PIOT. Finally, future work on packet processing optimization on multi-core CPUs is discussed.
Keywords: network; commodity multi-core; packet processing; SKB; DPDK
Precise Zero-Knowledge Arguments with Poly-logarithmic Efficiency
9
Authors: 丁宁, 谷大武. Journal of Shanghai Jiaotong University (Science), EI, 2009, No. 5, pp. 584-589 (6 pages)
Precise zero-knowledge was introduced by Micali and Pass in STOC'06. This notion captures the idea that the view of a verifier can be reconstructed in almost the same time. Following this notion, they constructed some precise zero-knowledge proofs and arguments in which the communicated messages are polynomially many bits. In this paper, we employ the new simulation technique they introduced to provide a precise simulator for a modified version of Kilian's zero-knowledge arguments with poly-logarithmic efficiency (a modification addressed by Rosen), and as a result we show that this protocol is a precise zero-knowledge argument with poly-logarithmic efficiency. We also present an alternative construction of the desired protocols.
Keywords: CRYPTOGRAPHY; ZERO-KNOWLEDGE; precise zero-knowledge
Novel algorithm for complex bit reversal: employing vector permutation and branch reduction methods
10
Authors: Feng YU, Ze-ke WANG, Rui-feng GE. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2009, No. 10, pp. 1492-1499 (8 pages)
We present novel vector permutation and branch reduction methods to minimize the number of execution cycles for bit reversal algorithms. The new methods are applied to a single instruction multiple data (SIMD) parallel implementation of the complex data floating-point fast Fourier transform (FFT). The number of operational clock cycles can be reduced by an average factor of 3.5 by using our vector permutation methods and by 1.1 by using our branch reduction methods, compared with conventional implementations. Experiments on the MPC7448 (a well-known SIMD reduced instruction set computing processor) demonstrate that our optimal bit-reversal algorithm consistently takes fewer than two cycles per element in complex array operations.
Keywords: Bit reversal; Vector permutation; Branch reduction; Single instruction multiple data (SIMD); Fast Fourier transform (FFT)