
A saliency object detection model based on the ConvMixer backbone
Abstract: Saliency object detection (SOD) algorithms mostly rely on backbone networks based on convolutional neural networks (CNNs) to extract features, but CNNs cannot capture long-range feature dependencies in an image. The Vision Transformer (ViT) divides the image into patches and propagates global context among the patches through a Transformer to obtain long-range feature dependencies, but the Transformer's self-attention layer has quadratic time complexity. This paper therefore proposes CM-PoolNet, a low-complexity patch-based SOD algorithm that improves the backbone of the classical PoolNet saliency detection model by replacing VGG and ResNet with the convolutional model ConvMixer, and proposes a new feature fusion method. Built on a U-shaped structure, the encoder applies Patch Embedding to the input image and feeds the result into a ConvMixer feature extractor made of repeatedly stacked depthwise separable convolutions and dilated convolutions. A patch-based feature fusion module is designed for the decoder. Three losses, BCE, SSIM, and IoU, guide the model to learn the mapping between the input image and the ground-truth image at three levels: pixel, patch, and feature map. Experiments on the DUTS and ECSSD datasets show that the proposed model effectively segments salient object regions and accurately predicts fine structures with clear boundaries.
Authors: 张斯博 (ZHANG Si-Bo); 朱敬华 (ZHU Jing-Hua); 奚赫然 (XI He-Ran); 杜欣月 (DU Xin-Yue). School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
Source: Journal of Engineering of Heilongjiang University (《黑龙江大学工程学报(中英俄文)》), 2024, No. 1, pp. 48-57 (10 pages)
Funding: National Natural Science Foundation of China (82374626).
Keywords: saliency object detection; patch embedding; mixed loss function; PoolNet; ConvMixer
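
The abstract names a mixed loss built from BCE, SSIM, and IoU terms but gives no formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes the three terms are summed with equal weights, that predictions and ground truth are flattened lists of values in [0, 1], and that SSIM is computed over a single window covering the whole input (practical SSIM averages over sliding local windows). The function names `bce_loss`, `ssim_loss`, `iou_loss`, and `hybrid_loss` are hypothetical.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all pixels (pixel-level term)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM over one global window (assumed simplification)."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    )
    return 1.0 - ssim


def iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU computed over the whole map."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)


def hybrid_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms; equal weights are an assumption."""
    w_bce, w_ssim, w_iou = weights
    return (
        w_bce * bce_loss(pred, target)
        + w_ssim * ssim_loss(pred, target)
        + w_iou * iou_loss(pred, target)
    )


if __name__ == "__main__":
    truth = [0.0, 1.0, 1.0, 0.0]
    guess = [0.1, 0.8, 0.9, 0.2]
    print(hybrid_loss(guess, truth))
```

In a training setup the same three terms would normally be written with tensor operations so that gradients can flow; the list-based version here only illustrates how each term measures error at a different granularity.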