Abstract
With the development of artificial intelligence, deep neural networks have become essential tools in a wide range of pattern recognition tasks. Because deep Convolutional Neural Networks (CNNs) have a huge number of parameters and high computational complexity, deploying them on edge computing devices with limited computing resources and storage space is challenging. Deep network compression has therefore become an active research topic in recent years. Low-rank decomposition and vector quantization are two important branches of deep network compression; both seek a compact representation of the original network structure in order to reduce the redundancy of the network parameters. By establishing a joint compression framework, a deep network compression method based on low-rank decomposition and vector quantization, named Quantized Tensor Decomposition (QTD), was proposed. The method performs further quantization on top of the low-rank structure of the network and thereby obtains a higher compression ratio. Experimental results for the classical ResNet on the CIFAR-10 dataset show that QTD can compress the number of network parameters to 1% of the original with an accuracy loss of only 1.71 percentage points. On the large-scale ImageNet dataset, the proposed method was compared with the quantization-based method PQF (Permute, Quantize, and Fine-tune), the low-rank decomposition-based method TDNR (Tucker Decomposition with Nonlinear Response), and the pruning-based method CLIP-Q (Compression Learning by In-parallel Pruning-Quantization). The experimental results show that QTD achieves better classification accuracy within the same compression range.
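The general idea of quantizing a low-rank structure can be illustrated with a small sketch. This is not the QTD algorithm from the paper: it assumes a plain truncated SVD of a single weight matrix in place of the paper's tensor decomposition, a basic k-means (Lloyd) codebook in place of its quantizer, and no fine-tuning. The function names, block size, rank, and codebook size are all hypothetical choices made for illustration.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Truncated SVD: weight (m x n) is approximated by left (m x rank) @ right (rank x n)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    left = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    right = vt[:rank, :]
    return left, right


def vector_quantize(matrix, block_dim, num_codes, iters=20, seed=0):
    """Split the matrix into sub-vectors of length block_dim and run plain k-means.

    Assumes matrix.size is divisible by block_dim and that there are at least
    num_codes sub-vectors. Returns the codebook and one code index per sub-vector.
    """
    rng = np.random.default_rng(seed)
    blocks = matrix.reshape(-1, block_dim)
    start = rng.choice(len(blocks), size=num_codes, replace=False)
    codebook = blocks[start].copy()
    for _ in range(iters):
        dists = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        codes = dists.argmin(axis=1)
        for k in range(num_codes):
            members = blocks[codes == k]
            if len(members) > 0:  # leave empty clusters unchanged
                codebook[k] = members.mean(axis=0)
    dists = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook, dists.argmin(axis=1)


def dequantize(codebook, codes, shape):
    """Rebuild a matrix of the given shape from the codebook and code indices."""
    return codebook[codes].reshape(shape)


def compressed_bits(shape, block_dim, num_codes, float_bits=32):
    """Bits for the code indices plus the floating-point codebook."""
    n_blocks = int(np.prod(shape)) // block_dim
    index_bits = max(1, int(np.ceil(np.log2(num_codes))))
    return n_blocks * index_bits + num_codes * block_dim * float_bits


# Illustrative run on a random matrix (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 128)).astype(np.float32)
left, right = low_rank_factors(weight, rank=8)

stored = 0
rebuilt = []
for factor in (left, right):
    codebook, codes = vector_quantize(factor, block_dim=4, num_codes=16)
    rebuilt.append(dequantize(codebook, codes, factor.shape))
    stored += compressed_bits(factor.shape, block_dim=4, num_codes=16)

approx = rebuilt[0] @ rebuilt[1]
ratio = stored / (weight.size * 32)
error = np.linalg.norm(weight - approx) / np.linalg.norm(weight)
print(f"stored bits / original bits: {ratio:.4f}")
print(f"relative reconstruction error: {error:.4f}")
```

A random Gaussian matrix has no low-rank structure, so the reconstruction error in this run is large; the sketch only shows how the two steps compose and how the storage cost is counted, not the accuracy figures reported in the abstract.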
Authors
王东炜
刘柏辰
韩志
王艳美
唐延东
WANG Dongwei; LIU Baichen; HAN Zhi; WANG Yanmei; TANG Yandong (State Key Laboratory of Robotics (Shenyang Institute of Automation, Chinese Academy of Sciences), Shenyang Liaoning 110016, China; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang Liaoning 110016, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《计算机应用》
CSCD
Peking University Core Journal
2024, No. 7, pp. 1987-1994 (8 pages)
Journal of Computer Applications
Funding
National Key Research and Development Program of China (2020YFB1313400).
Keywords
Convolutional Neural Network (CNN)
tensor decomposition
vector quantization
model compression
image classification