
A Solution to Address Misalignment in Multimedia Instruction Optimization on the Loongson Platform
Abstract: When code is optimized with multimedia instructions on the Loongson platform, floating-point load/store instructions are usually used to access the integers that are to be processed in parallel. If these integers are stored at memory addresses that are not naturally aligned, the performance of the optimized functions drops significantly. To ensure that the optimized functions perform equally well on non-aligned data, this paper uses the non-aligned load/store instructions of the Loongson general-purpose instruction set to give multimedia instructions access to non-aligned data. Non-aligned access instructions are used in pairs, and one pair takes roughly five times as long as a single floating-point access instruction, so their use has to be planned carefully. On this basis, the paper first designs a 64-bit non-aligned memory access function interface for the Loongson platform while retaining the existing memory access interface, then designs an interface adaptive optimization algorithm that selects among these interfaces according to the program context, and finally applies the algorithm to the optimized functions of the LibYUV library and tests them. The results show that, when the data is not aligned, the functions optimized with multimedia instructions achieve only small performance gains or even a general decline in performance, whereas with the interface adaptive optimization algorithm all optimized functions maintain an average performance improvement of nearly 40%.
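As a rough illustration of the idea in the abstract, the following portable C sketch keeps a fast aligned 64-bit load next to a slower non-aligned one and picks between them. It is not the paper's code: the names `load64_aligned`, `load64_unaligned`, `load64_select` and `is_aligned8` are hypothetical, `memcpy` stands in for the paired non-aligned instructions, and the run-time address check is only one possible selection rule (the abstract says the algorithm selects by program context and does not give details).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when p sits on a natural 8-byte boundary. */
static int is_aligned8(const void *p) {
    return ((uintptr_t)p & 7u) == 0;
}

/* Hypothetical "existing" interface: one direct 64-bit load.
 * The caller must pass an 8-byte-aligned pointer to 64-bit data. */
static uint64_t load64_aligned(const void *p) {
    assert(is_aligned8(p));
    return *(const uint64_t *)p;
}

/* Hypothetical non-aligned interface: memcpy is the portable stand-in
 * for the paired non-aligned load instructions mentioned in the abstract. */
static uint64_t load64_unaligned(const void *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Assumed selection rule: take the cheap path whenever the address allows,
 * and pay for the non-aligned path only when it is actually needed. */
static uint64_t load64_select(const void *p) {
    return is_aligned8(p) ? load64_aligned(p) : load64_unaligned(p);
}
```

Because the non-aligned path is reported to cost about five times as much as a single aligned access, a selector like this would normally be hoisted out of inner loops, for example by checking the row pointer once per image row rather than once per load.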
Authors: 李正平 (LI Zheng-ping); 程洋洋 (CHENG Yang-yang) (Institute of Electronic Information Engineering, Anhui University, Hefei 230039, China)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 1, pp. 60-63 (4 pages)
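Funding: Supported by the National Natural Science Foundation of China (40000009), the Natural Science Foundation of Anhui Province (10000007), and the Program for New Century Excellent Talents in University of the Ministry of Education (NCET-00-0001).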
Keywords: multimedia instructions; non-aligned data; memory access interface; interface adaptive optimization algorithm
