Method of Semi-supervised Video Object Segmentation Based on Encoding Memory Network

Abstract: Video object segmentation is a key task in computer vision, with important applications in fields such as autonomous driving and video coding. This paper proposes an efficient encoding memory network (EMNet) for semi-supervised video object segmentation. The method comprises four modules: an adaptive reference frame selection module, a dual-path matching module, a feature processing module, and a feature aggregation module. The adaptive reference frame selection module weighs both mask confidence and similarity to select information-rich reference frames. The dual-path matching module performs bidirectional, dual-scale matching between the query frame and the reference frames to improve the accuracy of target feature matching. The feature processing module consists of a semantic enhancement module and a feature refinement module, which strengthen the target's semantic and detail information through low-pass and high-pass filtering. The feature aggregation module then fuses these features. An evaluation on the DAVIS 2017 dataset demonstrates the effectiveness of the proposed model.
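The abstract does not say how the low-pass and high-pass filters are built, so the following is only a minimal sketch of the general idea, not EMNet's implementation. It assumes a single-channel feature map stored as a list of rows, uses a box (mean) filter as the low-pass filter, and takes the residual as the high-pass component. The names `box_blur` and `split_frequencies` are hypothetical, and reading the smooth part as "semantic" and the residual as "detail" is an assumption.

```python
def box_blur(feature, radius=1):
    """Low-pass filter: mean over a (2*radius+1)^2 window, truncated at the borders."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += feature[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def split_frequencies(feature, radius=1):
    """Split a feature map into a smooth (low-pass) part and a residual (high-pass) part.

    Assumption: the low-pass part stands in for coarse semantic content and the
    high-pass residual for edges and fine detail; the paper may use other filters.
    """
    low = box_blur(feature, radius)
    high = [[f - l for f, l in zip(f_row, l_row)]
            for f_row, l_row in zip(feature, low)]
    return low, high


# Toy 4x4 feature map with a bright 2x2 "object" in the centre.
feature = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
low, high = split_frequencies(feature)
```

By construction the two components sum back to the input, so a downstream aggregation step could reweight them without losing information. A real network would apply such filters per channel on tensors rather than on nested lists.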

Authors: YIN Liang; ZHANG Zhao; ZHANG Baopeng (CHN Energy Xinshuo Railway Co., Ltd. Maintenance Branch Office, Beijing 010300, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)

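Affiliations: CHN Energy Xinshuo Railway Co., Ltd. Maintenance Branch Office; School of Computer and Information Technology, Beijing Jiaotong University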

Source: Journal of Projectiles, Rockets, Missiles and Guidance (《弹箭与制导学报》, PKU Core), 2024, No. 3, pp. 11-21 (11 pages)

Funding: Supported by the National Natural Science Foundation of China (61972027).

Keywords: video object segmentation; encoding memory network; attention mechanism; semantic segmentation; deep learning

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

