Parallelization of the BP Algorithm Based on CUDA and Example Verification
Abstract: CUDA is a widely used model for general-purpose computing on GPUs (GPGPU), and the back-propagation (BP) algorithm is one of the most widely used neural network models. This paper proposes a method for parallelizing the BP algorithm with CUDA. When a BP neural network is trained with this method, the data are transferred to the GPU before training begins. Once training starts, computing the inputs, outputs, and errors of the hidden and output layers and updating the weights and biases are all carried out on the GPU. Applied to training on handwritten digit images, the method achieves a speed-up ratio of 6.12 to 8.17 over training on a quad-core CPU. When the results trained on the CPU and on the GPU are used to recognize the same test set of images, the recognition rate of the GPU-trained result is 0.05% to 0.22% higher than that of the CPU-trained result.
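Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 23, pp. 31-34, 51 (5 pages in total)
Funding: Supported by the Open Project of the State Key Laboratory of Computer Architecture (No. CARCH201105)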
Keywords: back-propagation (BP) algorithm; parallelization; Compute Unified Device Architecture (CUDA); handwritten digit training
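
The abstract names the steps that run on the GPU during training: computing layer inputs and outputs, computing errors, and updating weights and biases. The paper's CUDA kernels are not reproduced in this record, so the following is only a minimal serial sketch of one such training step in plain Python, assuming a single hidden layer, sigmoid units, and the common textbook error formulas. All names (`sigmoid`, `train_step`, `run_demo`) and the toy data are hypothetical. Each list comprehension or inner loop over units is independent work per unit, which is the kind of loop a GPU implementation would plausibly map to threads.

```python
import math
import random


def sigmoid(x):
    """Logistic activation (assumed; the record does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def train_step(x, target, w_h, b_h, w_o, b_o, lr):
    """One BP step for one sample; updates the weights in place, returns squared error."""
    n_in, n_hid, n_out = len(x), len(w_h), len(w_o)

    # Forward pass: every hidden unit j can be computed independently.
    hidden = [sigmoid(sum(w_h[j][i] * x[i] for i in range(n_in)) + b_h[j])
              for j in range(n_hid)]
    # Forward pass: every output unit k can be computed independently.
    out = [sigmoid(sum(w_o[k][j] * hidden[j] for j in range(n_hid)) + b_o[k])
           for k in range(n_out)]

    # Output-layer error: O_k * (1 - O_k) * (T_k - O_k).
    err_o = [out[k] * (1.0 - out[k]) * (target[k] - out[k]) for k in range(n_out)]
    # Hidden-layer error, back-propagated through the not-yet-updated output weights.
    err_h = [hidden[j] * (1.0 - hidden[j])
             * sum(err_o[k] * w_o[k][j] for k in range(n_out))
             for j in range(n_hid)]

    # Weight and bias updates: each weight is independent of the others.
    for k in range(n_out):
        for j in range(n_hid):
            w_o[k][j] += lr * err_o[k] * hidden[j]
        b_o[k] += lr * err_o[k]
    for j in range(n_hid):
        for i in range(n_in):
            w_h[j][i] += lr * err_h[j] * x[i]
        b_h[j] += lr * err_h[j]

    return sum((target[k] - out[k]) ** 2 for k in range(n_out))


def run_demo(steps=200, seed=0):
    """Train on one made-up sample and return (first_loss, last_loss)."""
    rng = random.Random(seed)
    n_in, n_hid, n_out = 4, 3, 2
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
    b_h = [0.0] * n_hid
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
    b_o = [0.0] * n_out
    x, target = [0.0, 1.0, 0.5, 0.2], [1.0, 0.0]  # toy data, not from the paper
    losses = [train_step(x, target, w_h, b_h, w_o, b_o, lr=0.5)
              for _ in range(steps)]
    return losses[0], losses[-1]


if __name__ == "__main__":
    first, last = run_demo()
    print("first loss:", first, "last loss:", last)
```

In a GPU version along the lines the abstract describes, the sample data and weight arrays would be copied to device memory once before the training loop, so that only the per-unit arithmetic above runs inside it.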

