Traffic Street Scene Instance Segmentation Based on Adaptive Regulatory Convolution and Dual-Path Information Embedding
Abstract Instance segmentation of urban street scenes is a key technology for autonomous driving. To address the dense instances, blurred edges, and severe background interference found in urban street scenes, this paper proposes RENet, an instance segmentation model based on adaptive regulatory convolution and dual-path information embedding. First, adaptive regulatory convolution replaces the original residual structure: deformable convolution learns offsets for the spatial sampling positions, which improves the model's ability to represent complex image deformations; channel shuffling is applied to the multi-branch structure to strengthen information flow between channels; and an attention mechanism adaptively recalibrates the channel weights. Together these improve segmentation accuracy for blurred and dense objects in complex scenes. Second, a low-dimensional spatial information embedding branch applies spatial information excitation and re-encoding to feature maps at different scales, embedding low-dimensional spatial information into the abstract semantic features to improve contour segmentation accuracy. Finally, a high-level semantic information embedding module aligns the feature maps with the semantic boxes, narrowing the semantic and resolution gaps between feature maps and making feature fusion across scales more effective. Experiments on a self-built dataset show that, compared with the original YOLACT network, RENet reaches an average segmentation accuracy of up to 51.6% against complex street backgrounds, an improvement of 10.4 percentage points, at an inference speed of 17.5 frames/s, which supports the model's effectiveness and its practicality in engineering applications.
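Two of the operations named in the abstract, channel shuffling and attention-based recalibration of channel weights, can be illustrated with a minimal pure-Python sketch. This is not the RENet implementation: the function names `channel_shuffle` and `channel_gate` are hypothetical, the shuffle follows the common group-interleaving scheme, and the gate is a parameter-free simplification (sigmoid of each channel's mean) standing in for a learned attention module whose details are not given in this record.

```python
import math


def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to viewing the channel axis as (groups, per_group),
    transposing to (per_group, groups), and flattening again.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def channel_gate(feature_maps):
    """Rescale each channel by a weight derived from its global average.

    Assumption: a real attention block would pass the pooled values
    through learned layers; here the sigmoid of the mean is used directly.
    feature_maps is a list of channels, each a list of rows of floats.
    """
    gated = []
    for ch in feature_maps:
        count = sum(len(row) for row in ch)
        mean = sum(sum(row) for row in ch) / count  # global average pooling
        weight = 1.0 / (1.0 + math.exp(-mean))      # sigmoid gate in (0, 1)
        gated.append([[weight * v for v in row] for row in ch])
    return gated
```

With six channels in two groups, `channel_shuffle` alternates between the groups, so every downstream group sees channels from both branches; `channel_gate` then scales each channel as a whole, leaving its spatial pattern unchanged.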
Authors He Zifen (何自芬); Huang Junxuan (黄俊璇); Zhang Yinhui (张印辉); Zhu Shouye (朱守业) (School of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500)
Source Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), 2023, No. 7, pp. 1086-1096 (11 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (62171206, 62061022).
Keywords dense instances; street scene segmentation; adaptive regulatory convolution; complex deformation modeling; dual-path information embedding