Abstract
Video object segmentation is a key task in computer vision, with important applications in fields such as autonomous driving and video coding. This paper proposes an efficient encoding memory network (EMNet) for semi-supervised video object segmentation. The method comprises an adaptive reference frame selection module, a dual-path matching module, a feature processing module, and a feature aggregation module. The adaptive reference frame selection module jointly considers mask confidence and similarity to select reference frames that contain rich information. The dual-path matching module performs bidirectional, dual-scale matching between the query frame and the reference frames to improve the accuracy of target feature matching. The feature processing module consists of a semantic enhancement module and a feature refinement module, which strengthen the target's semantic and detail information through low-pass and high-pass filtering. The feature aggregation module then fuses these features. An evaluation on the DAVIS2017 dataset demonstrates the effectiveness of the proposed model.
Authors
YIN Liang; ZHANG Zhao; ZHANG Baopeng (CHN Energy Xinshuo Railway Co., Ltd. Maintenance Branch Office, Beijing 010300, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Source
Journal of Projectiles, Rockets, Missiles and Guidance (《弹箭与制导学报》)
PKU Core Journal
2024, No. 3, pp. 11-21 (11 pages)
Funding
Supported by the National Natural Science Foundation of China (61972027).
Keywords
video object segmentation
encoding memory network
attention mechanism
semantic segmentation
deep learning