A Large Matrix Calculation Method on C870 Stream Processor


Abstract: The C870 stream processor has a three-level memory hierarchy with three corresponding access modes. Its stream-processing architecture is particularly well suited to compute-intensive applications that have good data parallelism and little global data reuse. Based on the hardware and software structure of the C870 stream processor, and targeting problems that involve highly floating-point-intensive operations and parallel computation over very large numbers of data elements, this paper proposes using computation to hide memory-access latency, thereby raising the bandwidth of the memory system. It also proposes, for the first time, a large-matrix calculation method on the C870 stream processor that uses on-chip shared memory. Optimization experiments on 5000*5000 and 2000*2000 square matrices show that using on-chip shared memory improves floating-point performance by more than 7 times and improves the efficiency of parallel computing.
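The idea summarized in the abstract, staging data in on-chip shared memory so that each value fetched from device memory is reused many times, can be sketched in CUDA roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not say which matrix operation the paper benchmarks, so matrix multiplication is assumed here, and the kernel name `matmul_tiled`, the tile edge `TILE = 16`, and the 500*500 demo size are all hypothetical choices.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // assumed tile edge; not a value reported in the paper

// Computes C = A * B for square n x n row-major matrices.
// Each block stages one TILE x TILE tile of A and of B in shared memory,
// so every element loaded from device memory is reused TILE times.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Pad with zeros when n is not a multiple of TILE.
        As[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // both tiles fully loaded before any thread reads them

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // all reads done before the next iteration overwrites
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

int main() {
    const int n = 500;  // small demo size; the paper reports 2000 and 5000
    const size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = static_cast<float>(i % 7);
        hB[i] = static_cast<float>(i % 5);
    }

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(reinterpret_cast<void**>(&dA), bytes);
    cudaMalloc(reinterpret_cast<void**>(&dB), bytes);
    cudaMalloc(reinterpret_cast<void**>(&dC), bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_tiled<<<grid, block>>>(dA, dB, dC, n);

    // The blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Spot-check one row of the result against a CPU reference.
    const int r = 123;
    float maxDiff = 0.0f;
    for (int c = 0; c < n; ++c) {
        float ref = 0.0f;
        for (int k = 0; k < n; ++k) ref += hA[r * n + k] * hB[k * n + c];
        maxDiff = fmaxf(maxDiff, fabsf(ref - hC[r * n + c]));
    }
    printf("max abs diff on row %d: %g\n", r, maxDiff);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return 0;
}
```

In this sketch each thread block reads a tile from device memory once and then serves all `TILE` partial products per element from shared memory, which is the kind of reuse the abstract credits for the speedup. The actual gain depends on the hardware and on the tile size chosen, and the 7x figure above is the paper's measurement, not a property of this code.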

Authors: Jia Dan (贾丹), Chen Qingkui (陈庆奎)

Affiliation: School of Computer and Electrical Engineering, University of Shanghai for Science and Technology

Source: Control & Automation (《微计算机信息》), Peking University Core Journal, 2008, No. 24, pp. 303-305 (3 pages)

Keywords: C870 stream processor; matrix calculation; on-chip shared memory

Classification: TP302.7 [Automation and Computer Technology: Computer System Architecture]


