
Multi-task perception algorithm of autonomous driving based on temporal fusion (cited by: 3)
Abstract: Sequential image frames were used as input to mine the temporal association among consecutive frames, and a multi-task joint driving-environment perception algorithm fusing temporal information was constructed to rapidly detect traffic participants and the drivable area through multi-task supervision and joint optimization. ResNet50 was used as the backbone network, in which a cascaded feature fusion module was built to capture the non-local long-range dependence among different image frames. High-resolution images were processed by convolutional downsampling to accelerate the feature extraction of different frames, balancing the detection accuracy and speed of the algorithm. To eliminate the influence of object-motion-induced spatial displacement between frames on feature fusion, and to exploit the non-local associations among frames, a temporal feature fusion module was constructed to align and match the feature maps of different frames in time, forming a fused global feature. Based on the parameter-sharing backbone network, keypoint heatmap generation was used to detect the positions of pedestrians, vehicles, and traffic lights on the road, and a semantic segmentation sub-network provided drivable-area information for autonomous vehicles. Results show that the proposed algorithm takes sequential frames instead of a single frame as input, making effective use of the temporal characteristics of the sequence. Through downsampling, the cascaded feature fusion module reduces the computational complexity to 1/16 of that without the module. Compared with other mainstream models such as CornerNet and ICNet, the detection accuracy and segmentation performance of the algorithm improve by an average of 6% and 5%, respectively, and the processing speed reaches 12 frames per second, giving the algorithm clear advantages in both the speed and the accuracy of detection and segmentation. 6 tabs, 9 figs, 31 refs.
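The 1/16 complexity figure is consistent with the quadratic cost of non-local (self-attention-style) modules in the number of spatial positions: such a module forms a pairwise-affinity matrix with (H·W)² entries, so halving both H and W (one stride-2 convolutional downsampling) shrinks it by a factor of 16. A minimal arithmetic sketch (the function name and feature-map sizes below are illustrative assumptions, not taken from the paper):

```python
# Hypothetical sketch: why one 2x spatial downsampling cuts non-local
# attention cost to 1/16.  A non-local module compares every spatial
# position against every other, so its affinity matrix has (H*W)^2 entries.

def nonlocal_pairwise_ops(h: int, w: int) -> int:
    """Number of pairwise-affinity entries for an H x W feature map."""
    n = h * w
    return n * n

full = nonlocal_pairwise_ops(64, 128)  # example full-resolution feature map
down = nonlocal_pairwise_ops(32, 64)   # after one stride-2 downsampling

assert full // down == 16  # complexity drops to 1/16
```

The same ratio holds for any input size, since ((H/2)·(W/2))² = (H·W)²/16.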
Authors: LIU Zhan-wen; FAN Song-hua; QI Ming-yuan; DONG Ming; WANG Pin; ZHAO Xiang-mo (School of Information Engineering, Chang'an University, Xi'an 710064, Shaanxi, China; University of California, Berkeley, Berkeley 94804-4648, California, USA)
Source: Journal of Traffic and Transportation Engineering (EI, CSCD, Peking University Core), 2021, No. 4, pp. 223-234 (12 pages)
Funding: National Natural Science Foundation of China (U1864204); National Key Research and Development Program of China (2019YFB1600103); Key Research and Development Program of Shaanxi Province (2018ZDXM-GY-044).
Keywords: traffic information engineering; environmental perception; temporal fusion; object detection; semantic segmentation