Research of Data-Parallel-Implementation Method of DCT Based on SIMD-PE-Array

Cited by: 3
Abstract: To meet the real-time processing demands of gigapixel-level frames, and to address the heavy computational load of the DCT and the insufficient parallelism of conventional implementations, this paper proposes a data-parallel DCT implementation method based on a SIMD PE array. Because the PE array is inherently scalable and can be trimmed to size, the method can be applied to embedded systems with different parallelism requirements. The paper also proposes a data-parallel operation scheme based on PE identifiers, which both solves the "PE autonomy" problem in local computation and eliminates the time overhead of data addressing. The scheme is regular and compact, satisfies the strong regularity that SIMD operation requires, and is in line with the direction in which parallel processing technology is developing.
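Author: Zhong Sheng (钟升)
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2009, No. 7, pp. 1546-1553 (8 pages)
Funding: National Defense Microelectronics Pre-Research Project (No. 41308010203)
Keywords: parallel processing; DCT; SIMD PE array; mapping language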
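
The abstract's idea of operating on data by PE identifier can be illustrated with a small simulation. The sketch below is not the paper's method or its mapping language. It assumes a hypothetical row of 8 PEs, where every PE executes the same multiply-accumulate on a broadcast sample and derives its cosine weight from its own identifier instead of looking it up by address. The names `pe_coefficient`, `simd_dct_1d`, and `simd_dct_2d` are invented for this illustration.

```python
import math

N = 8  # assumed number of PEs in one row of the array (hypothetical)


def pe_coefficient(pe_id, n, size=N):
    """Cosine weight that PE `pe_id` derives locally from its own identifier."""
    scale = math.sqrt(1.0 / size) if pe_id == 0 else math.sqrt(2.0 / size)
    return scale * math.cos((2 * n + 1) * pe_id * math.pi / (2 * size))


def simd_dct_1d(samples):
    """Simulate a row of PEs computing an orthonormal 1-D DCT-II in lockstep."""
    size = len(samples)
    acc = [0.0] * size  # one accumulator per PE
    for n, x in enumerate(samples):  # one broadcast step per input sample
        # Every PE runs the same multiply-accumulate; only its identifier differs.
        acc = [a + x * pe_coefficient(pe_id, n, size)
               for pe_id, a in enumerate(acc)]
    return acc


def simd_dct_2d(block):
    """Separable 2-D DCT: a row pass followed by a column pass."""
    rows = [simd_dct_1d(r) for r in block]
    cols = [simd_dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to [u][v] order
```

In this sketch the per-PE behaviour comes only from the identifier, so the instruction stream stays identical across PEs. Changing the number of PEs only changes the length of the input lists, which loosely mirrors the scalability the abstract describes.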