Abstract
In semi-supervised video object segmentation, the one-shot video object segmentation (OSVOS) method uses the annotated object mask of the first frame as guidance to separate the foreground objects from the background in subsequent frames. Although it achieves impressive segmentation results, it does not handle cases where the appearance of the foreground object changes significantly or where the foreground and background look similar. To address these problems, a U-Net-like network structure for video object segmentation was proposed. An attention mechanism was added between the encoder and decoder of the network to establish associations between feature maps and generate global semantic information. The loss function was also optimized to further reduce the imbalance between classes and improve the robustness of the model. In addition, multi-scale prediction was combined with a fully connected conditional random field (FC/Dense CRF) to improve the smoothness of the edges in the segmentation results. Extensive experiments on the challenging DAVIS 2016 dataset showed that the proposed method achieves segmentation results competitive with other state-of-the-art methods.
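The abstract does not state which loss the authors optimized. As an illustration only, the sketch below shows one common way to counter foreground/background imbalance in binary segmentation: a class-balanced binary cross-entropy in which each class is weighted by the frequency of the other. The function name `class_balanced_bce`, the `eps` parameter, and the NumPy formulation are assumptions made for this example, not the paper's implementation.

```python
import numpy as np


def class_balanced_bce(prob, mask, eps=1e-7):
    """Class-balanced binary cross-entropy (illustrative sketch, not the paper's loss).

    prob: predicted foreground probabilities in [0, 1], any shape.
    mask: ground-truth labels of the same shape (1 = foreground, 0 = background).
    """
    # Clip so that log() never receives exactly 0.
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    mask = np.asarray(mask, dtype=np.float64)

    # beta is the background fraction; foreground pixels are weighted by beta
    # and background pixels by (1 - beta), so the rarer class weighs more.
    beta = 1.0 - mask.mean()

    pos_term = -beta * mask * np.log(prob)
    neg_term = -(1.0 - beta) * (1.0 - mask) * np.log(1.0 - prob)
    return float((pos_term + neg_term).mean())


# Hypothetical 2x2 frame with a single foreground pixel.
mask = np.array([[1, 0], [0, 0]])
uninformed = np.full(mask.shape, 0.5)      # predicts 0.5 everywhere
confident = mask * 0.9 + (1 - mask) * 0.1  # mostly correct prediction
loss_uninformed = class_balanced_bce(uninformed, mask)
loss_confident = class_balanced_bce(confident, mask)
```

With this weighting, the single foreground pixel contributes as much to the total as the three background pixels combined, which is the effect a class-imbalance correction aims for.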
Authors
黄志勇
韩莎莎
陈致君
姚玉
熊彪
马凯
HUANG Zhi-yong; HAN Sha-sha; CHEN Zhi-jun; YAO Yu; XIONG Biao; MA Kai (College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443000, China)
Source
Journal of Graphics (《图学学报》)
CSCD
PKU Core (北大核心)
2023, No. 1, pp. 104-111 (8 pages)
Funding
National Natural Science Foundation of China (61871258).
Keywords
semi-supervised video object segmentation
attention mechanism
loss function
multi-scale features
fully connected conditional random field