Deep Model Compression for Mobile Platforms: A Survey (cited by: 7)


Abstract: Despite the rapid development of mobile and embedded hardware, directly executing computation-expensive and storage-intensive deep learning algorithms locally on these devices remains constrained for sensory data analysis. In this paper, we first summarize the layer compression techniques for state-of-the-art deep learning models in three categories: weight factorization and pruning, convolution decomposition, and special layer architecture design. For each category, we quantify the storage and computation that the techniques can tune, and discuss their practical challenges and possible improvements. Then, we implement Android projects using TensorFlow Mobile to test these 10 compression methods and compare their practical performance in terms of accuracy, parameter size, intermediate feature size, computation, processing latency, and energy consumption. To further discuss their advantages and bottlenecks, we test their performance on four standard recognition tasks across six resource-constrained Android smartphones. Finally, we survey two types of run-time Neural Network (NN) compression techniques that are orthogonal to the layer compression techniques: run-time resource management and cost optimization with special NN architectures.

Authors: Kaiming Nan, Sicong Liu, Junzhao Du, Hui Liu

Affiliations: School of Computer Science and Technology; School of Software and Institute of Software Engineering

Source: Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2019, Issue 6, pp. 677-693 (17 pages). Chinese journal title: 清华大学学报（自然科学版）（英文版）

Funding: Supported by the National Key Research and Development Program of China (No. 2018YFB1003605), the Foundations of CARCH (No. CARCH201704), the National Natural Science Foundation of China (No. 61472312), the Foundations of Shaanxi Province and Xi'an Science Technology Plan (Nos. B018230008 and BD34017020001), and the Foundations of Xidian University (No. JBZ171002).

Keywords: deep learning; model compression; run-time resource management; cost optimization

Classification: N [General Natural Sciences]

