Abstract
Today, most heterogeneous architectures are assembled from IP modules supplied by different hardware vendors, but this model is inefficient. To solve this problem, AMD, together with other hardware vendors, proposed the Heterogeneous System Architecture (HSA) specification. On the one hand, HSA helps developers speed up design and programming. On the other hand, it improves system performance and reduces power consumption. In this paper we present the implementation of a framework for accelerating the training and classification of arbitrary Convolutional Neural Networks (CNNs) on HSA. Building on this implementation, we present two acceleration methods: updating the weights online and letting the CPU participate in the computation. Experimental results show that the implementation of CNNs on HSA is 4 to 10 times faster than on the CPU.
Source
Proceedings of the International Conference of Pioneering Computer Scientists, Engineers and Educators
2017, Issue 1, pp. 150-152 (3 pages)
International Conference of Pioneering Computer Scientists, Engineers and Educators(ICPCSEE)