Design and Implementation of Floating-Point Multiply-Accumulate Processing Element under SMVM System
Abstract: Sparse Matrix-Vector Multiply (SMVM), of the form Ab=x, is one of the important computational kernels in scientific computing, information retrieval, data mining, and other fields. Because the non-zero elements of a sparse matrix are sparsely distributed, implementing this operation on a microprocessor suffers from problems such as a high Cache miss ratio, so performance is not ideal. To address this problem, the paper proposes a new approach that implements an SMVM system on an FPGA, partitions the system functions between software and hardware, and completes the design and implementation of the hardware floating-point multiply-accumulate Processing Element (PE) in the system. The target device is Virtex4 LX60, and the working frequency reaches 123.6 MHz.
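As a rough illustration of the operation the abstract describes, the sketch below computes x = Ab in software, with the multiply-accumulate step in the inner loop. The abstract does not say how the matrix is stored, so the compressed sparse row (CSR) layout is an assumption, and the names `smvm_csr`, `values`, `col_idx`, and `row_ptr` are hypothetical. It is not the paper's hardware PE design.

```python
from typing import List, Sequence


def smvm_csr(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    b: Sequence[float],
) -> List[float]:
    """Compute x = A b for a sparse matrix A held in CSR form (assumed layout).

    values  : non-zero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    """
    x: List[float] = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0  # accumulator for this row's dot product
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Multiply-accumulate: one product added to the running sum.
            # The read of b[col_idx[k]] is indirect and irregular, which is
            # the access pattern the abstract links to high Cache miss ratios.
            acc += values[k] * b[col_idx[k]]
        x.append(acc)
    return x


# Example: A = [[2, 0, 0], [0, 0, 3], [1, 0, 4]], b = [1, 2, 3]
x = smvm_csr([2.0, 3.0, 1.0, 4.0], [0, 2, 0, 2], [0, 1, 2, 4], [1.0, 2.0, 3.0])
print(x)
```

Each pass through the inner loop is a single multiply followed by an add into a running sum, which is the kind of work a floating-point multiply-accumulate unit performs.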
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2006, No. 35, pp. 107-109 (3 pages)
Keywords: multiply-accumulate, floating-point, Sparse Matrix-Vector Multiply (SMVM), FPGA
