Survey of image semantic segmentation methods in the deep learning era
(Original Chinese title: 深度学习背景下的图像语义分割方法综述)

Cited by: 5

Abstract: Semantic segmentation is the premise and basis of many computer vision tasks and has important application value in fields such as virtual reality and autonomous driving. With the rapid development of deep learning, and especially the emergence of convolutional neural networks (CNNs), image semantic segmentation has made considerable progress. First, this paper introduces the concept of semantic segmentation, its background, and its basic processing pipeline. It then summarizes open-source 2D, 2.5D, and 3D datasets and the segmentation methods suited to them, describes in detail the segmentation characteristics, advantages, disadvantages, and accuracy of different networks, and concludes that supervised learning is an effective training approach. It also introduces authoritative performance evaluation metrics and, according to the emphasis of each method, compares and analyzes the related experiments and points out problems common to current experimental work. Among the methods compared, the DeepLab-V3+ network performs well in both segmentation accuracy and speed and has high application value. On this basis, and in light of the state of research in China and abroad, the paper identifies several current challenges and possible future research directions. This summary and analysis is intended as a reference for researchers working on image semantic segmentation.

Introduced by Ohta in 1980, image semantic segmentation assigns each pixel in an image a pre-defined label that represents its semantic category. Because it aims to understand the different scenes in images, image semantic segmentation has received much research attention in the field of computer vision. In recent years, many research laboratories around the world have carried out work on image semantic segmentation based on deep learning, and academic conferences in automation, artificial intelligence, and pattern recognition have also reported results on semantic segmentation. At the same time, semantic segmentation serves as the premise and basis of many computer vision tasks and has important application value in fields such as virtual reality, autonomous driving, and human-computer interaction. With the rapid development of deep learning technology, especially the emergence of convolutional neural networks, image semantic segmentation has made great progress and has far outperformed traditional methods in accuracy and efficiency.

First, this paper introduces the concept of semantic segmentation along with its background and basic process. In general, image semantic segmentation based on deep learning goes through three processing modules: feature extraction, semantic segmentation, and refinement. Second, this paper summarizes the open-source 2D, RGB-D, and 3D datasets that have been used in recent years and their corresponding segmentation methods. The semantic segmentation methods for 2D data are divided into methods based on candidate regions, methods based on fully supervised learning, and methods based on weakly supervised learning. For RGB-D and 3D data, only a few semantic segmentation methods exist, so no further classification is performed. This paper describes in detail the network structure of several classical algorithms, the segmentation characteristics, advantages, and disadvantages of different networks, and their segmentation accuracy. This summary shows that most segmentation methods are based on fully supervised learning, which is an effective training method. Third, this paper introduces several authoritative performance evaluation indexes, such as mean average precision (mAP) and mean intersection over union (mIoU), and tests the segmentation accuracy and computing performance of semantic segmentation methods in experiments on 2D data. The experimental section shows that the DeepLab-V3+ network has good segmentation accuracy and speed, which attests to its high application value. The semantic segmentation performance for 2.5D and 3D data is also compared. The following key problems are highlighted in this section: some algorithms are not tested on authoritative datasets; some algorithms are not open source; and some experiments do not describe the relevant experimental parameters in detail.

Therefore, considering the current state of research in China and abroad, this paper highlights several challenges and proposes some new directions for future research. First, segmentation algorithms tend to prioritize either accuracy or real-time performance while ignoring the other. Second, a segmentation network usually needs large amounts of memory for inference and training, which makes it unsuitable for some devices. Third, the design of segmentation algorithms adapted to 3D data is a current research focus, but high-quality 3D datasets are generally lacking, and the existing 3D datasets are patchwork datasets. Fourth, only a few segmentation algorithms are available for RGB-D and 3D data (particularly for 3D data), and open-source algorithms generally have low accuracy. Fifth, sequence data have temporal consistency. Sixth, some methods solve the problem of video or sequence segmentation, while others do not use time series information to improve accuracy or segmentation efficiency. Seventh, some papers have proposed that face detection can be realized without training a deep neural network and have examined whether semantic segmentation can be realized without training a network. Through this summary and analysis, the paper hopes to provide a valuable reference for future research on image semantic segmentation.
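
The abstract names mean intersection over union (mIoU) as one of the evaluation indexes. As a rough illustration only, the sketch below computes mIoU from a confusion matrix in plain Python. The function names `confusion_matrix` and `mean_iou`, the flattened label lists, and the convention of skipping classes that appear in neither the prediction nor the ground truth are all assumptions made for this example; they are not taken from the paper, and benchmark toolkits may handle absent or ignored classes differently.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over the classes present."""
    matrix = confusion_matrix(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class-c pixels predicted as another class
        fp = sum(row[c] for row in matrix) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a 2x3 label map flattened row by row, with 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is about 0.5. A per-class breakdown like this is often more informative than overall pixel accuracy, because a frequent class cannot mask poor results on a rare one.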

Authors: Yan Yi (严毅); Deng Chao (邓超); Li Lin (李琳); Zhu Lingkun (朱凌坤); Ye Biao (叶彪) (School of Automobile and Traffic Engineering, Wuhan University of Science and Technology, Wuhan 430063, China; School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430063, China; School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China)

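Affiliations: School of Automobile and Traffic Engineering, Wuhan University of Science and Technology; School of Computer Science and Technology, Wuhan University of Science and Technology; School of Transportation and Logistics Engineering, Wuhan University of Technology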

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 11, pp. 3342-3362 (21 pages)

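Funding: National Natural Science Foundation of China, Young Scientists Fund (52002298); Open Project of the Transportation Industry Key Laboratory of "Transport Vehicle Detection, Diagnosis and Maintenance Technology" (JTZL2205); Open Project of the Sichuan Province Engineering Laboratory of Intelligent Perception and Control Technology for Unmanned Systems (WRXT2022-001); Open Project of the Cloud-Based Internet of Things Laboratory for Intelligent Highway Construction and Maintenance Equipment (KF_2022_301002).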

Keywords: deep learning; image semantic segmentation (ISS); convolutional neural network (CNN); supervised learning; DeepLab-V3+ network

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
