
High Efficient QR Decomposition by Vector Technology on Multi-core DSP
Abstract: Multi-core digital signal processor (DSP) cluster systems, which use multi-core DSPs as computing nodes, have become a major development trend. At present, because the hardware resources of multi-core DSP cores are underutilized and memory access bandwidth is limited, there is a wide gap between peak and actual performance. Based on the rich instruction set architecture of the C66x core and the principles for scheduling arithmetic instructions, and using the assembly information provided by the compiler, this paper designs and optimizes a QR decomposition algorithm that fully exploits single-core DSP performance while reducing the computation time of matrix decomposition. Building on these optimization techniques, a large-scale parallel QR decomposition model for a multi-core DSP cluster system is designed and implemented, and the decomposition task is completed on a distributed computing framework. The analysis shows that both the computational efficiency of the optimized QR decomposition and the C66x single-core hardware resource utilization improve by more than 20 times, and that as the size of the matrix to be decomposed is multiplied, the performance gain of the multi-core DSP cluster over a single core becomes increasingly pronounced.
Authors: ZHANG Yufan (张宇帆); CHEN Ying (陈颖); FANG Ke (方科); FEI Xia (费霞) (Southwest China Institute of Electronic Technology, Chengdu 610036, China; Sichuan Key Laboratory of Agile Intelligent Computing, Chengdu 610036, China)
Source: Telecommunication Engineering (《电讯技术》), Peking University Core Journal, 2023, No. 4, pp. 536-543 (8 pages)
Keywords: multi-core digital signal processor (DSP); QR decomposition; software optimization; distributed computing
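For readers unfamiliar with the operation named in the abstract, QR decomposition factors a matrix A into Q (orthonormal columns) and an upper-triangular R such that A = QR. The abstract does not say which QR variant the authors use, so the sketch below assumes modified Gram-Schmidt purely for illustration; the function names `qr_mgs` and `matmul` are hypothetical, and this is plain reference code, not the C66x-optimized implementation. Its inner loops are dot products and scaled vector subtractions, which is the kind of regular arithmetic that vector instructions are generally well suited to.

```python
import math


def qr_mgs(a):
    """Factor an m x n matrix (list of rows, m >= n, full column rank)
    into Q (m x n, orthonormal columns) and R (n x n, upper triangular).

    Assumption: modified Gram-Schmidt is used here only as an example;
    the source abstract does not name the QR variant.
    """
    m, n = len(a), len(a[0])
    # cols[j] starts as column j of A and is overwritten with column j of Q.
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        norm = math.sqrt(sum(x * x for x in cols[j]))
        if norm == 0.0:
            raise ValueError("matrix is rank deficient")
        r[j][j] = norm
        cols[j] = [x / norm for x in cols[j]]
        # Remove the q_j component from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(q * x for q, x in zip(cols[j], cols[k]))
            cols[k] = [x - r[j][k] * q for q, x in zip(cols[j], cols[k])]
    q = [[cols[j][i] for j in range(n)] for i in range(m)]
    return q, r


def matmul(x, y):
    """Plain triple-loop matrix product, used only to check A = QR."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


if __name__ == "__main__":
    a = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
    q, r = qr_mgs(a)
    for row in r:
        print([round(v, 6) for v in row])
```

Multiplying the returned factors with `matmul(q, r)` should reproduce the input matrix up to floating-point rounding, which is a quick sanity check for any QR routine regardless of the variant chosen.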
