
Simplifying inference computation of neural networks by identical binary tensor factorization
Abstract: Existing methods for simplifying neural network inference often suffer from model accuracy degradation and the additional overhead of retraining. This work proposes an identical binary tensor factorization (IBTF) algorithm that further reduces multiply-accumulate (MAC) operands at the bit level. IBTF uses tensor factorization to eliminate the computation repetition among multiple convolution kernels caused by repeated weight bit patterns, while keeping the computational results identical, so no retraining is required. Because IBTF simplifies models at the bit level, it is orthogonal to data-level simplification methods such as quantization and sparsification, and can be used together with them to further reduce the MAC workload. Experimental results on several mainstream neural networks show that, compared with quantized and sparsified models, IBTF further reduces the MAC operation count by a factor of 3.32, and that it remains effective for convolutions with different kernel sizes, weight bit-widths, and sparsity rates.
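The abstract only summarizes IBTF; the exact factorization is given in the full paper. The sketch below (hypothetical helper names, not the authors' implementation) illustrates the underlying bit-level idea: an integer weight vector decomposes into binary bit-planes, w = Σ_b 2^b · B_b, so x · w = Σ_b 2^b (x · B_b). When kernels share identical bit-planes, each distinct binary dot product is computed once and reused via shift-and-add, which is where the MAC reduction comes from.

```python
import numpy as np

def bitplanes(w, bits=8):
    """Decompose a non-negative integer weight vector into binary bit-planes:
    w = sum_b 2^b * plane[b], where each plane is a 0/1 vector."""
    return [((w >> b) & 1) for b in range(bits)]

def dot_via_bitplanes(x, kernels, bits=8):
    """Compute x . w for every kernel, evaluating each *distinct* binary
    bit-plane only once; identical bit-planes shared across kernels
    (or across bit positions) reuse the cached partial sum."""
    cache = {}  # bit-plane bytes -> binary dot product with x
    results = []
    for w in kernels:
        acc = 0
        for b, plane in enumerate(bitplanes(w, bits)):
            key = plane.tobytes()
            if key not in cache:
                cache[key] = int(plane @ x)  # binary MAC: additions only
            acc += cache[key] << b           # shift-and-add reconstruction
        results.append(acc)
    return results, len(cache)

x = np.array([3, 1, 4, 1], dtype=np.int64)
k1 = np.array([5, 3, 5, 3], dtype=np.int64)
k2 = k1 * 2  # same bit patterns as k1, shifted by one bit position
res, distinct = dot_via_bitplanes(x, [k1, k2], bits=4)
assert res == [int(k1 @ x), int(k2 @ x)]  # results identical, no retraining
```

In this toy example the two kernels share the same distinct bit-planes, so 4 binary dot products replace the 8 that a naive per-kernel bit-serial evaluation would need, mirroring how repeated bit patterns across kernels shrink the MAC count without changing the output.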
Authors: HAO Yifan, DU Zidong, ZHI Tian (Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)
Source: Chinese High Technology Letters (《高技术通讯》), CAS, 2022, No. 7, pp. 687-695 (9 pages)
Funding: National Key R&D Program of China (2017YFB1003101, 2018AAA0103300, 2017YFA0700900); National Natural Science Foundation of China (61532016, 61732007)
Keywords: neural network; identical binary tensor factorization (IBTF); multiply-accumulate (MAC)