Abstract
This paper optimizes and evaluates matrix multiplication on the GODSON-3B eight-core processor. The memory access behavior of each of the three matrices A, B, and C is optimized with a method suited to its own access pattern, hiding memory access time. The optimized matrix multiplication achieves 122 Gflops, an efficiency of 95.3%.
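Funding
National Natural Science Foundation of China (60736012, 60921002)
National Basic Research Program of China (973 Program) (2005CB321600)
National High Technology Research and Development Program of China (863 Program) (2008AA110901)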
Keywords
multi-core
vector extension
register file
matrix multiplication