Abstract
To address the slow training of Restricted Boltzmann Machines (RBMs) on large data, and the resulting difficulty of reaching a good model, a GPU-based parallel acceleration method for RBM training is proposed. First, the implementation steps of the contrastive divergence algorithm are redesigned for the GPU. Next, building on previous GPU parallel schemes, the matrix multiply-add operations of training are executed with the CUBLAS library; a combined Tausworthe113 and CLCG4 random number generator is designed, giving a longer period with more concise code; and sigmoid function values are computed using the CUDA texture-memory fetch mode. Finally, training time and training quality are evaluated. Experiments on the MNIST handwritten digit dataset show that, compared with previous parallel RBM code, the new GPU parallel scheme has a clear advantage when training on large-scale datasets, achieving a speedup of at least 25.
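The CD-1 training step that the abstract summarizes can be sketched as follows. This is a minimal CPU illustration in NumPy, not the paper's code: in the paper's scheme the matrix multiply-adds run through CUBLAS GEMM calls, the uniform random numbers come from the combined Tausworthe113/CLCG4 generator, and the sigmoid is fetched from a texture-memory lookup table. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    # Logistic function; the paper evaluates this on the GPU via a
    # texture-memory lookup table, computed directly here.
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1, rng=None):
    """One CD-1 update for a binary RBM (illustrative sketch).

    v0   : (batch, n_visible) data batch
    W    : (n_visible, n_hidden) weight matrix
    b, c : visible / hidden bias vectors
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Positive phase: hidden probabilities and binary samples.
    ph0 = sigmoid(v0 @ W + c)          # matrix multiply-add (GEMM on the GPU)
    h0 = (rng.random(ph0.shape) < ph0).astype(v0.dtype)
    # Negative phase: one Gibbs step back to visible, then hidden again.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + c)
    # CD-1 gradient approximation and parameter update.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

The two `@`-products per phase are exactly the operations the paper offloads to CUBLAS; batching many MNIST images into `v0` is what turns them into large GEMMs worth parallelizing.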
Source
Electronic Design Engineering (《电子设计工程》)
2016, No. 2, pp. 28-31 and 34 (5 pages)
Funding
National Natural Science Foundation of China (Grant No. 61032001)