Funding: Funded by the National Natural Science Foundation of China (NSFC) under Grant No. 62203192.
Abstract: Video salient object detection (VSOD) aims to locate the most attractive objects in a video by exploring spatial and temporal features. VSOD is a challenging task in computer vision because it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made by existing VSOD models, they still struggle in scenes with great background diversity within and between frames. They also suffer from accumulated noise and high time consumption when extracting temporal features over long durations. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates the collaboration of saliency cues in the spatial domain with a multi-stream structure to handle great background diversity. A straightforward yet efficient approach to temporal feature extraction is developed to avoid accumulated noise and reduce time consumption. MSTENet differs from other VSOD methods in that it incorporates both foreground and background supervision, which facilitates the extraction of collaborative saliency cues. Another notable difference is its integration of spatial and temporal features: the temporal module is built into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
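The abstract mentions joint foreground and background supervision but does not state the loss. As a rough illustration only, the sketch below assumes two predicted maps (one for the salient object, one for the background) and supervises them with binary cross-entropy against the ground-truth mask and its complement. The function names `bce` and `fg_bg_loss`, the choice of cross-entropy, and the equal weighting are all assumptions, not taken from MSTENet.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted map and a target map."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-(target * np.log(pred)
                   + (1.0 - target) * np.log(1.0 - pred)).mean())

def fg_bg_loss(fg_pred, bg_pred, mask):
    """Hypothetical joint loss: the foreground map is supervised by the
    saliency mask, the background map by its complement (assumed form)."""
    return bce(fg_pred, mask) + bce(bg_pred, 1.0 - mask)

# Toy 2x2 ground-truth mask: 1 marks salient pixels.
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
good = fg_bg_loss(mask, 1.0 - mask, mask)  # both maps agree with the mask
bad = fg_bg_loss(mask, mask, mask)         # background map is inverted
```

In this toy case the loss is close to zero when the two maps are complementary and consistent with the mask, and it grows when the background map contradicts the mask.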
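Abstract: This paper proposes an adaptive motion detection algorithm based on kernel density estimation (KDE). The algorithm first introduces a dual-threshold selection method, with adaptive foreground and background thresholds, for pixel classification. Using two thresholds overcomes the shortcomings of single-threshold classification; the thresholds are selected adaptively and suit different scenes. On this basis, the paper proposes a probability-based background update model that updates the background according to each pixel's probability. It uses an inter-frame difference background model together with the KDE classification results to resolve deadlock in background updating and to detect sudden background changes. Experiments demonstrate the adaptability and reliability of the proposed method.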
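The pipeline described above (per-pixel KDE, dual-threshold classification, selective background update) can be sketched roughly as follows. This is a minimal sketch under several assumptions: a Gaussian kernel with a fixed bandwidth, fixed rather than adaptive thresholds `t_fg` and `t_bg` (the abstract does not say how they are adapted), and a simplified update that only admits pixels classified as background. All function names are hypothetical, and the inter-frame difference step is omitted.

```python
import numpy as np

def kde_background_probability(history, frame, bandwidth=10.0):
    """Gaussian KDE of each pixel's current value given its past samples.
    history: (N, H, W) array of past grayscale frames; frame: (H, W)."""
    diff = frame[None, :, :] - history
    kernels = np.exp(-0.5 * (diff / bandwidth) ** 2)
    kernels /= bandwidth * np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=0)

def classify_dual_threshold(prob, t_fg, t_bg):
    """Return -1 (foreground) where prob < t_fg, 1 (background) where
    prob > t_bg, and 0 (undecided) in between. Assumes t_fg < t_bg."""
    labels = np.zeros(prob.shape, dtype=np.int8)
    labels[prob < t_fg] = -1
    labels[prob > t_bg] = 1
    return labels

def update_history(history, frame, labels):
    """Drop the oldest sample; append the new value only at background
    pixels, repeating the latest sample elsewhere (a simplification)."""
    updated = np.roll(history, -1, axis=0)
    updated[-1] = np.where(labels == 1, frame, history[-1])
    return updated

# Toy example: a static background of value 100 with one moving pixel.
history = np.full((20, 4, 4), 100.0)
frame = np.full((4, 4), 100.0)
frame[0, 0] = 101.0   # small change, still consistent with the background
frame[1, 1] = 200.0   # large change, should be flagged as foreground
prob = kde_background_probability(history, frame)
labels = classify_dual_threshold(prob, t_fg=1e-4, t_bg=1e-2)
new_history = update_history(history, frame, labels)
```

Pixels falling between the two thresholds stay undecided here; the abstract suggests such cases, and the update deadlock they can cause, are handled with the help of an inter-frame difference model.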