Optimized Implementation of Sorting Algorithms on Loongson 3A
Abstract: This paper analyzes merge sort and quicksort and, based on the architectural characteristics of the Chinese-designed Loongson 3A CPU, proposes and implements two optimized algorithms. The optimizations exploit memory-access characteristics and apply copy optimization, loop unrolling, swap-operation optimization, and a mix of different basic sorting algorithms. Test results show that, without affecting sort stability, both optimized algorithms improve sorting performance by 16.9% to 90.5% compared with the sorting function in the Glibc 2.11 library.
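The abstract names its techniques without showing code, so the following is only a minimal sketch of how three of them could combine in a merge sort, not the paper's implementation. Insertion sort handles short runs (mixing basic sorts), the merge passes alternate between two buffers instead of copying back after every pass (one possible reading of "copy optimization"), and the tail copy is unrolled by four. The names `hybrid_merge_sort`, `merge_runs`, and `insertion_sort`, the cutoff `SMALL_RUN`, and the restriction to `int` keys are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_RUN 16 /* assumed cutoff; not a value taken from the paper */

/* Stable insertion sort on a[lo..hi). */
static void insertion_sort(int *a, size_t lo, size_t hi) {
    for (size_t i = lo + 1; i < hi; i++) {
        int key = a[i];
        size_t j = i;
        while (j > lo && a[j - 1] > key) { /* strict '>' keeps equal keys in order */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Stable merge of src[lo..mid) and src[mid..hi) into dst[lo..hi). */
static void merge_runs(const int *src, int *dst,
                       size_t lo, size_t mid, size_t hi) {
    assert(lo <= mid && mid <= hi);
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++]; /* ties take the left run */
    /* Copy the rest of the left run, manually unrolled by four. */
    while (i + 4 <= mid) {
        dst[k] = src[i];
        dst[k + 1] = src[i + 1];
        dst[k + 2] = src[i + 2];
        dst[k + 3] = src[i + 3];
        k += 4;
        i += 4;
    }
    while (i < mid) dst[k++] = src[i++];
    while (j < hi) dst[k++] = src[j++];
}

/* Bottom-up merge sort. Returns 0 on success, -1 if the scratch buffer
   cannot be allocated. */
int hybrid_merge_sort(int *a, size_t n) {
    if (n < 2) return 0;
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL) return -1;

    /* Phase 1: sort short runs in place with insertion sort. */
    for (size_t lo = 0; lo < n; lo += SMALL_RUN) {
        size_t hi = (lo + SMALL_RUN < n) ? lo + SMALL_RUN : n;
        insertion_sort(a, lo, hi);
    }

    /* Phase 2: merge runs of doubling width, swapping buffer roles each pass. */
    int *src = a, *dst = buf;
    for (size_t width = SMALL_RUN; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = (lo + width < n) ? lo + width : n;
            size_t hi = (lo + 2 * width < n) ? lo + 2 * width : n;
            merge_runs(src, dst, lo, mid, hi);
        }
        int *tmp = src; /* swap roles instead of copying dst back into src */
        src = dst;
        dst = tmp;
    }

    /* At most one final copy, needed only when the result ended up in buf. */
    if (src != a) memcpy(a, src, n * sizeof *a);
    free(buf);
    return 0;
}
```

Calling `hybrid_merge_sort(values, count)` on an `int` array sorts it in ascending order. Whether such a sketch would beat the Glibc routine on a given machine depends on the cutoff and the memory hierarchy, which is the kind of tuning the abstract attributes to the Loongson 3A-specific work.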
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, No. 20, pp. 255-257, 270 (4 pages)
Funding: National "863" Program of China (2008AA010902); National Natural Science Foundation of China (60803066)
Keywords: Loongson 3A; merge sort; quicksort; optimized algorithm; loop unrolling


