
GPU-based parallel acceleration of the Restricted Boltzmann Machine (cited: 1)

Realization of RBM parallel training based on GPU
Abstract: To address the slow training and the difficulty of reaching an optimal model when a Restricted Boltzmann Machine (RBM) handles large data, a GPU-based parallel acceleration method for RBM training is proposed. First, the implementation steps of the contrastive divergence algorithm are re-planned for the GPU. Second, building on previous GPU parallel schemes, the matrix multiply-add operations of training are executed with CUBLAS; a combined Tausworthe113 and CLCG4 random number generator is designed, giving a longer period and more concise code; and sigmoid function values are computed using CUDA's texture-memory fetch mode. Finally, training time and quality are evaluated. Experiments on the MNIST handwritten digit set show that, compared with previous RBM parallel code, the new GPU parallel scheme has a clear advantage when training on large-scale data sets, reaching a speedup of 25 or more.
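As a rough illustration of the training step the abstract redesigns for the GPU, the following is a minimal CPU-side NumPy sketch of one CD-1 (contrastive divergence) update. All names, shapes, and the learning rate are illustrative assumptions, not taken from the paper; on the GPU, the matrix multiply-adds below would map onto CUBLAS GEMM calls, the sigmoid onto a texture-memory lookup, and the sampling onto the combined Tausworthe113/CLCG4 generator.

```python
import numpy as np

def sigmoid(x):
    # On the GPU the paper computes this via a CUDA texture-memory fetch;
    # here it is evaluated directly.
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_v, b_h, v0, lr=0.1, rng=None):
    """One CD-1 step for a binary RBM (hypothetical reference sketch).

    W    : (n_visible, n_hidden) weight matrix
    b_v  : (n_visible,) visible biases
    b_h  : (n_hidden,)  hidden biases
    v0   : (batch, n_visible) data batch
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Positive phase: hidden probabilities and a stochastic sample.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step back down and up again.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Gradient estimate: positive statistics minus negative statistics.
    batch = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h
```

Each update is dominated by three dense matrix products, which is why delegating them to CUBLAS, as the abstract describes, yields most of the reported speedup.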
Source: Electronic Design Engineering (《电子设计工程》), 2016, No. 2, pp. 28-31, 34 (5 pages).
Fund: National Natural Science Foundation of China (61032001).
Keywords: Restricted Boltzmann Machine (RBM); GPU; CUDA; speedup; parallel acceleration


