Local and Global Context Attentive Fusion Network for Traffic Scene Parsing

Cited by: 1
Abstract: To address the adaptive aggregation of local and global contextual information in traffic scene parsing, a Local and Global Context Attentive Fusion Network (LGCAFN) with a three-module architecture is proposed. The front-end feature extraction module consists of a 101-layer Residual Network (ResNet-101) improved with Cascaded Atrous Spatial Pyramid Pooling (CASPP) units, which extracts the multi-scale local features of objects more effectively. The mid-end structural learning module is composed of eight Long Short-Term Memory (LSTM) branches, which infer more accurately the spatial structural features of the scene regions adjacent to an object in eight different directions. The back-end feature fusion module adopts a three-stage, attention-based fusion method that adaptively aggregates useful contextual information and suppresses noisy contextual information; the resulting multimodal fusion features represent the semantic information of objects more comprehensively and accurately. Experimental results on the Cityscapes standard and extended datasets show that, compared with existing state-of-the-art methods such as the Inverse Transformation Network (ITN) and the Object Contextual Representation Network (OCRN), LGCAFN achieves the best mean Intersection over Union (mIoU), reaching 84.0% and 86.3% respectively. This indicates that LGCAFN parses traffic scenes accurately and can support autonomous driving.
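For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean Intersection over Union (mIoU) is commonly computed from predicted and ground-truth label maps. It is not taken from the paper: the function name `mean_iou`, the flattened-list input, and the `ignore_label=255` default are illustrative assumptions, and the authors' evaluation code may differ.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over flat sequences of integer class labels.

    Assumptions (hypothetical, not from the paper): labels are given as
    flat sequences, target labels are either in [0, num_classes) or equal
    to ignore_label, and pixels marked ignore_label are excluded.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled pixel: contributes to no class
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: a miss for class t and a false alarm for class p.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that actually occur in pred or target.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(mean_iou(pred, target, num_classes=3))
```

In this toy input the per-class IoU values are 1/2, 2/3 and 1, so the mean is 13/18. For a dataset-level score, the per-class intersection and union counts are typically accumulated over all images before dividing, rather than averaging per-image scores.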
Authors: WANG Zeyu; BU Shuhui; HUANG Wei; ZHENG Yuanpan; WU Qinggang; ZHANG Xu (College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan 450002, China; School of Aeronautics, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 3, pp. 713-722 (10 pages)
Funding: Henan Province Science and Technology Research Project (222102210021); Key Scientific Research Project Plan of Higher Education Institutions of Henan Province (21A520049).
Keywords: traffic scene parsing; adaptive aggregation; Cascaded Atrous Spatial Pyramid Pooling (CASPP); Long Short-Term Memory (LSTM); attentive fusion