Software Pipelining Technique Based on Four-Phase Manual Optimization (基于四阶段人工优化的软件流水技术)

Cited by: 1

Abstract: For embedded systems with limited memory resources, code size is one of the most important optimization concerns. Using the oprofile profiling tool, with the EEMBC benchmark suite as the workload, this paper proposes a Four-Phase Manual Optimization (FPMO) method for software pipelining. Experimental results on the telecom autocorrelation benchmark show that FPMO achieves a 40.678% performance improvement at the cost of a 2.04% increase in code size, whereas compiler automatic optimization alone achieves a 38.33% performance improvement at the cost of a 33.35% increase in code size.
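The trade-off described in the abstract can be illustrated with a small sketch. The C code below is not the authors' FPMO procedure, since the abstract does not spell out its four phases. It is an assumed, minimal example of hand-applied software pipelining on an autocorrelation-style loop: the loads for iteration i+1 are issued while the multiply-accumulate for iteration i is still pending. The function names `autocorr_lag` and `autocorr_lag_pipelined` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: one lag of an autocorrelation, sum of x[i] * x[i + k]. */
int32_t autocorr_lag(const int16_t *x, size_t n, size_t k)
{
    int32_t acc = 0;
    for (size_t i = 0; i + k < n; i++)
        acc += (int32_t)x[i] * x[i + k];
    return acc;
}

/* Hand-pipelined variant (illustrative assumption, not the paper's code):
 * prologue loads iteration 0, the kernel overlaps the loads of the next
 * iteration with the multiply-accumulate of the current one, and the
 * epilogue drains the last pending pair. */
int32_t autocorr_lag_pipelined(const int16_t *x, size_t n, size_t k)
{
    if (k >= n)
        return 0;                  /* no overlapping samples at this lag */

    size_t m = n - k;              /* number of loop iterations */
    int32_t acc = 0;

    /* Prologue: loads for iteration 0. */
    int16_t a = x[0];
    int16_t b = x[k];

    /* Kernel: load for iteration i, compute for iteration i - 1. */
    for (size_t i = 1; i < m; i++) {
        int16_t a_next = x[i];
        int16_t b_next = x[i + k];
        acc += (int32_t)a * b;
        a = a_next;
        b = b_next;
    }

    /* Epilogue: multiply-accumulate for the final iteration. */
    acc += (int32_t)a * b;
    return acc;
}
```

The only code added relative to the baseline is a short prologue and epilogue, which suggests why a manual pipeline may cost less code size than aggressive compiler unrolling. Whether it runs faster depends on the target processor and compiler, so any gain should be checked with a profiler on the actual platform.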

Authors: Zhou Guojian (周国建), Wu Shaogang (吴少刚), Li Zusong (李祖松), Shi Gang (史岗)

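Affiliations: College of Computer and Communication Engineering, China University of Petroleum (East China); Microprocessor Center, Institute of Computing Technology, Chinese Academy of Sciences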

Source: Computer Engineering (《计算机工程》), 2009, No. 5, pp. 40-43 (4 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.

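Funding: National "863" Program of China (2007AA01Z114); Key Project of the National "863" Program of China, "Low-Cost Advanced Standalone Computer" (低成本先进计算机单机) (2006AA010201); National Natural Science Foundation of China (60703017)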

Keywords: software pipelining; loop unrolling; performance analysis; Four-Phase Manual Optimization (FPMO)

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
