Abstract
The growing demand of data-intensive deep learning applications such as image recognition cannot be met without the support of dedicated hardware such as GPUs and FPGAs. Building on hardware offloading and a dataflow architecture, this paper proposes a processing model that offloads deep learning computation onto a smart network interface card (SmartNIC) with a many-core structure, so that data bypasses the CPU and the operating system kernel and is processed in the network. By partitioning the SmartNIC's compute resources and decomposing the deep learning model, the portability of deep learning models to a many-core SmartNIC, a low-cost general-purpose device, is verified: the effective structure of the AlexNet neural network is migrated onto an Agilio NIC to realize in-network computing for data-intensive deep learning applications, and a pipelined design is used to improve the throughput and parallelism of in-network data processing on the many-core SmartNIC. Experiments show that the system processes image data with high throughput while keeping processing latency at the microsecond level.
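To make the pipelined decomposition concrete, the following is a minimal host-side sketch, not the paper's actual SmartNIC firmware: each pipeline stage stands in for a group of NIC micro-engines that owns one slice of the network's layers, and queues stand in for inter-engine data transfer. The stage boundaries, layer shapes, and weights below are illustrative assumptions only.

```python
"""Illustrative pipeline-parallel inference sketch (assumed, not the paper's code)."""
import queue
import threading
import numpy as np

def stage_worker(fn, q_in, q_out):
    """One pipeline stage: pull an item, apply its layer slice, push downstream."""
    while True:
        item = q_in.get()
        if item is None:          # sentinel: propagate shutdown and exit
            q_out.put(None)
            return
        idx, x = item
        q_out.put((idx, fn(x)))

# Toy layer slices standing in for AlexNet's conv / fully-connected groups.
# Shapes are assumptions chosen only to keep the example small.
W1 = np.random.rand(128, 64)
W2 = np.random.rand(64, 32)
W3 = np.random.rand(32, 10)
stages = [
    lambda x: np.maximum(x @ W1, 0),   # "conv" group with ReLU
    lambda x: np.maximum(x @ W2, 0),   # "fc" group with ReLU
    lambda x: x @ W3,                  # classifier logits
]

# Wire the stages together with queues: one worker per stage, like one
# micro-engine group per layer slice on the SmartNIC.
queues = [queue.Queue() for _ in range(len(stages) + 1)]
threads = [threading.Thread(target=stage_worker,
                            args=(fn, queues[i], queues[i + 1]),
                            daemon=True)
           for i, fn in enumerate(stages)]
for t in threads:
    t.start()

# Feed a small batch of flattened "images" and drain the results.
for i in range(4):
    queues[0].put((i, np.random.rand(1, 128)))
queues[0].put(None)

while True:
    item = queues[-1].get()
    if item is None:
        break
    idx, y = item
    print(f"image {idx}: logits shape {y.shape}")
```

Because every stage works on a different image at the same time, throughput is bounded by the slowest stage rather than by the whole model, which is the effect the pipelined design on the many-core SmartNIC aims for.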
Authors
SHEN Shuo; XING Kai (School of Software Engineering, University of Science and Technology of China, Jiangsu 215123, China; School of Computer Science and Technology, University of Science and Technology of China, Anhui 230027, China)
Source
Electronic Technology (Shanghai), 2022, No. 8, pp. 28-33 (6 pages)
Funding
National Natural Science Foundation of China (NSFC 61332004).
Keywords
smart network interface card (SmartNIC)
in-network computing
low latency
neural network