
A Concise and Efficient Method for Accelerating Convolution Neural Networks

Cited by: 16
Abstract: Convolutional neural networks (CNNs) are widely used in machine learning and play an important role in deep learning. Because a CNN typically has many layers and the training data are usually large, training a network can take hours or even days. Although some work on accelerating CNN training with GPUs has been published, the implementations are generally complex, demand considerable skill, and are error-prone. This paper proposes a concise and efficient method for accelerating CNN training. Its main idea is to unroll the convolutional layers so that the main training steps of both the convolutional and fully connected layers can be expressed as matrix multiplications, which are then computed efficiently with a BLAS library. The method requires little attention to the details of parallel processing or the characteristics of the processor cores, and it yields speedups on both CPUs and GPUs. Experiments show that the GPU implementation of this method is more than 100 times faster than a traditional CPU implementation.
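The unrolling step the abstract describes (commonly known as im2col) copies every receptive field of the input into a column of a matrix, so that applying all filters at once becomes a single matrix multiply that a BLAS GEMM routine can execute on either CPU or GPU. Below is a minimal NumPy sketch of the idea, assuming stride 1, no padding, and a single input channel; the function names are illustrative, not taken from the paper:

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll all kh x kw patches of a 2-D input into columns of a matrix.

    x: (H, W) input. Returns a (kh*kw, out_h*out_w) matrix whose columns
    are the flattened receptive fields (stride 1, no padding).
    """
    H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

def conv2d_gemm(x, kernels):
    """Convolution (cross-correlation) expressed as one matrix multiply.

    kernels: (num_filters, kh, kw). The single matmul below is exactly the
    kind of operation the paper hands off to a BLAS GEMM routine.
    """
    n, kh, kw = kernels.shape
    cols = im2col(x, kh, kw)                    # (kh*kw, out_h*out_w)
    K = kernels.reshape(n, kh * kw)             # each row: one flattened filter
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    return (K @ cols).reshape(n, out_h, out_w)

# Check against a direct sliding-window implementation.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
k = rng.standard_normal((2, 3, 3))
ref = np.empty((2, 4, 4))
for f in range(2):
    for i in range(4):
        for j in range(4):
            ref[f, i, j] = np.sum(x[i:i + 3, j:j + 3] * k[f])
print(np.allclose(conv2d_gemm(x, k), ref))  # True
```

The unrolling duplicates overlapping input values, trading memory for the ability to use one highly optimized GEMM call, which is why the same code path accelerates on both CPU BLAS and GPU libraries such as cuBLAS.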
Author: 刘进锋 (Liu Jinfeng)
Source: Science Technology and Engineering (《科学技术与工程》, Peking University Core Journal), 2014, Issue 33, pp. 240-244 (5 pages)
Funding: Supported by the Ningxia Natural Science Foundation (NZ12163)
Keywords: convolutional neural networks; convolution unrolling; matrix multiplication; CUDA; BLAS
Co-cited references: 98

Citing articles: 16

Second-level citing articles: 233
