Programming Optimization Method in Data Storage Processing Based on C6000
Abstract: Conventional programming on the TI TMS320C6000 family of digital signal processors (DSPs) tends to overlook the CPU's instruction-level parallelism and software pipelining, as well as the high execution speed available through compiler intrinsics, linear assembly, and hand-written assembly. This paper analyzes four common problems and gives an optimization method for each: memory dependencies between instructions, loop redundancy, poor pipelining of nested loops, and memory bank conflicts. A reference experiment using a fixed-point dot product shows that code running speed improves by up to 85.9%, 86.4%, and 93.1% after the corresponding C-level, linear assembly, and hand assembly optimizations, respectively.
Authors: Yuan Weiqi (苑玮琦), Wang Bin (王斌)
Source: Computer Engineering (《计算机工程》), CAS, CSCD, 2012, No. 17, pp. 276-279, 283 (5 pages)
Funding: National Natural Science Foundation of China (60972123); Shenyang Science and Technology Program (F10-213-1-00)
Keywords: Digital Signal Processor (DSP); programming optimization; instruction parallelism; software pipelining; linear assembly; data storage
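
The abstract names a fixed-point dot product as its reference kernel, along with memory dependencies and loop redundancy as optimization targets. The paper's own code is not reproduced on this page, so the following is only a minimal portable C99 sketch of what C-level optimization of such a kernel can look like. The function names `dotp_baseline` and `dotp_unrolled` are hypothetical, and the unrolled version assumes an even element count. The `restrict` qualifiers tell the compiler the two input arrays do not alias, and the two independent accumulators give it two multiply-accumulate chains it can schedule in parallel. TI-specific intrinsics and pragmas are deliberately left out so that the sketch compiles with any C99 compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Baseline: one 16-bit x 16-bit multiply-accumulate per iteration. */
static int32_t dotp_baseline(const int16_t *a, const int16_t *b, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += (int32_t)a[i] * b[i];
    }
    return sum;
}

/* Sketch of a C-level optimization (not the paper's code):
 * - restrict promises the compiler that a and b do not overlap,
 *   removing the assumed memory dependency between their accesses;
 * - unrolling by two with separate accumulators exposes two
 *   independent multiply-accumulate chains per iteration.
 * Assumption: n is even. */
static int32_t dotp_unrolled(const int16_t *restrict a,
                             const int16_t *restrict b, int n)
{
    assert(n % 2 == 0);
    int32_t sum0 = 0;
    int32_t sum1 = 0;
    for (int i = 0; i < n; i += 2) {
        sum0 += (int32_t)a[i] * b[i];
        sum1 += (int32_t)a[i + 1] * b[i + 1];
    }
    return sum0 + sum1;
}
```

Both functions return the same result for the same inputs; any speed difference depends on the target and compiler, and the percentages quoted in the abstract refer to the authors' own experiment, not to this sketch.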
