Abstract
Graph convolutional networks (GCN), an algorithm for processing non-Euclidean data, are currently accelerated mainly on GPUs through deep learning frameworks such as PyTorch. However, GCN computation involves multiple levels of nested matrix multiplication and memory access operations. GPUs can meet the real-time requirements, but they are costly to deploy and have low energy efficiency. To improve the computational performance of the GCN algorithm while maintaining software flexibility, this paper proposes a custom GCN accelerator based on a RISC-V SoC. On the Hummingbird E203 SoC platform, GCN is accelerated through hardware-software co-design that combines dot-product extension instructions with a hardware accelerator. An analysis of the neural network parameters determines the hardware quantization scheme from floating point to 32-bit fixed point. Experimental results show that, when running the GCN algorithm on the Cora dataset, the proposed accelerator incurs no accuracy loss and achieves a speedup of up to 6.88 times.
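As a rough illustration of the two ideas named in the abstract (quantization from floating point to 32-bit fixed point, and a dot-product primitive), the following is a minimal software sketch. The abstract does not state the fixed-point format, so the Q16.16 split (`FRAC_BITS = 16`) and the helper names `to_fixed`, `fixed_dot`, and `to_float` are assumptions made here for illustration only; they are not the paper's hardware design or instruction semantics.

```python
import numpy as np

FRAC_BITS = 16            # assumed Q16.16 split; the abstract only says "32-bit fixed point"
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantize floats to signed 32-bit fixed point, saturating on overflow."""
    scaled = np.round(np.asarray(x, dtype=np.float64) * SCALE)
    return np.clip(scaled, -2**31, 2**31 - 1).astype(np.int32)

def fixed_dot(a, b):
    """Dot product of two fixed-point vectors.

    Products are accumulated as Python integers (arbitrary precision),
    then shifted right once to return to the Q16.16 scale.
    """
    acc = sum(int(x) * int(y) for x, y in zip(a, b))
    return acc >> FRAC_BITS

def to_float(q):
    """Convert a fixed-point value back to a float."""
    return q / SCALE

# Example: values exactly representable in Q16.16
a = to_fixed([0.5, -1.25, 2.0])
b = to_fixed([1.0, 2.0, 0.25])
result = to_float(fixed_dot(a, b))   # floating-point reference: -1.5
```

In this sketch the accumulator is kept wider than the operands and rescaled only once at the end, which is a common way to limit rounding error in fixed-point dot products; whether the paper's accelerator does the same is not stated in the abstract.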
Authors
周理
赵祉乔
潘国腾
铁俊波
赵王
ZHOU Li; ZHAO Zhi-qiao; PAN Guo-teng; TIE Jun-bo; ZHAO Wang (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journal (北大核心)
2023, Issue 12, pp. 2113-2120 (8 pages)
Computer Engineering & Science
Keywords
RISC-V
graph convolutional neural network
hardware accelerator
instruction set