Abstract
Objective Research on salient object detection (SOD) based on fully convolutional network (FCN) models has held that a larger decoding network achieves better detection than a small one, which leads to a huge number of parameters in the decoding stage. The visual attention mechanism alleviates the problem of oversized models to some extent. This paper divides attention mechanisms into two types: strong attention provides a stronger prior for decoding but carries a high risk, whereas weak attention is less risky but provides a weaker prior. On this basis, we propose and verify the view that a small network architecture with weak attention can reach the detection accuracy of a large network. Method We design two stages, global saliency prediction and weak-attention-based edge optimization, whose core is the proposed dense weak attention module. The module compensates for the shortcomings of weak attention: with only a few additional parameters, it provides prior information no weaker than that of strong attention. Result Under the same experimental environment, the proposed model achieves overall better detection results on five datasets. Meanwhile, the proposed method keeps the model size at 69.5 MB and reaches a real-time detection speed of 32 frames per second. The experimental results show that, compared with detection methods using strong attention, the proposed dense weak attention module gives the detection model better generalization ability. Conclusion The goal of this paper is to use the weak attention mechanism to improve detection efficiency, for which we design a weak attention module that balances efficiency and risk. The weak attention mechanism improves the efficiency of decoding features, thereby compressing the model size and speeding up detection, and shows better generalization ability on the existing test sets.
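The strong/weak distinction above boils down to a multiplicative versus an additive use of the attention mask, as detailed in the English abstract below. The following minimal PyTorch sketch illustrates the difference; the tensor shapes and variable names are illustrative assumptions, not the authors' code.

```python
import torch

# Illustrative shapes: a batch of decoder features and a normalized mask.
features = torch.randn(1, 64, 56, 56)   # decoder features (assumed shape)
mask = torch.rand(1, 1, 56, 56)         # attention mask with values in [0, 1]

# Strong attention: multiplicative. Wherever the mask is 0, the feature
# distribution at that location is wiped out irreversibly.
strong = features * mask

# Weak attention: additive. The mask only shifts features in feature
# space, destroying nothing, but later convolutions can smooth the
# added signal away.
weak = features + mask
```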
Objective Salient object detection, also called saliency detection, aims to localize and segment the most conspicuous and eye-attracting objects or regions in an image. Many applications benefit from saliency detection, such as image and video compression, context-aware image retargeting, scene parsing, image resizing, object detection, and segmentation. The detection process comprises feature extraction and the mapping of features to saliency values. Most state-of-the-art salient object detection models use features extracted by a convolutional network pre-trained for classification. Related works have shown that models based on fully convolutional networks (FCNs) can encode semantically rich features, thereby improving the robustness and accuracy of saliency detection. An intuitive opinion holds that a large, complex network performs better than a small, simple one; consequently, many current methods lack efficiency and require substantial storage resources. In the past few years, the attention mechanism has been employed to aid many visual tasks by reducing decoding difficulty and enabling lightweight networks. More specifically, the attention mechanism uses a pre-estimated attention mask to provide useful prior knowledge to the decoding process. This mechanism eases the mapping from features to saliency values and removes the need to design a large and complex decoding network. However, the widely used strong attention applies a multiplicative operation between the attention mask and the features. When the attention mask is normalized, that is, its values range from 0 to 1, a mask value of 0 irreversibly wipes out the distribution of the corresponding features; thus, strong attention carries an overfitting risk. In contrast, weak attention applies an additive operation and is less risky but also less efficient: it shifts the features in the feature space and does not destroy their distribution, but the added information can be smoothed away by subsequent convolutions. The longer the sequence of convolutional layers, the weaker the effect the attention mask exerts on the decoding features. This work contributes in three aspects: 1) we analyze the visual attention mechanism by dividing it into strong and weak attention and qualitatively explain how the attention mechanism improves decoding efficiency; 2) we discuss the principles of the two types of attention mechanism; and 3) we propose a dense weak attention module that uses features more efficiently than existing methods.

Method Instead of applying weak attention only before the first convolutional layer, we apply it repeatedly and consecutively, that is, before every decoding convolutional layer. This dense weak attention module (DWAM) yields an end-to-end detection model called the dense weak attention network. The model inherits an FCN-like architecture consisting of a sequence of convolutional, pooling, and activation layers; the encoder is fine-tuned from the VGG-16 network, and the decoding network is divided into two parts: global saliency detection and edge optimization using the DWAM. A rough saliency map is first predicted in the deepest branch of the network. This map is then treated as an attention mask and concatenated to shallower features to predict a saliency map of higher resolution. To supervise the side saliency maps, we add a cross-entropy loss after each side output, a practice known as deep supervision, to optimize the network. We find that weak attention plays an important role in refining the detection result by providing effective prior information: with few additional parameters, it improves both detection accuracy and detection speed. To obtain a more robust prediction, atrous spatial pyramid pooling is used to enhance the detection of multiscale targets.
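The dense application of weak attention described above can be sketched as follows. This is an interpretation under stated assumptions rather than the published implementation: the channel count, the number of layers per block, and the bilinear resizing of the mask are illustrative, and the additive form of weak attention is used as the defining operation even where the abstract also mentions concatenation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseWeakAttentionBlock(nn.Module):
    """One decoding block that applies weak (additive) attention
    before every convolutional layer, not just the first one."""

    def __init__(self, channels: int = 64, num_layers: int = 3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_layers)
        )
        # 1x1 conv producing the side saliency map used for deep supervision.
        self.side_out = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor, mask: torch.Tensor):
        # Resize the coarse saliency map to this stage's resolution so it
        # can serve as an attention mask for the shallower features.
        mask = F.interpolate(mask, size=feats.shape[2:], mode="bilinear",
                             align_corners=False)
        x = feats
        for conv in self.convs:
            x = F.relu(conv(x + mask))   # weak attention before EVERY conv
        side = torch.sigmoid(self.side_out(x))  # side output, deeply supervised
        return x, side
```

In the full model, the rough map from the deepest branch would serve as `mask`, and each block's `side` output would receive its own cross-entropy loss, matching the deep supervision described above.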
Result We compared the proposed method with seven FCN-based state-of-the-art techniques on five widely accepted benchmarks, using three evaluation criteria: mean absolute error (MAE), F-measure, and the precision-recall curve. Under the same conditions, the proposed model demonstrates more competitive results than the other state-of-the-art methods. Its MAE is generally better than that of the other methods, which means that the DWAM produces more accurate pixel-level results. Its F-measure is higher than those of most state-of-the-art methods by approximately 2% to 6%. In addition, the precision-recall curves show that the DWAM holds a slight advantage and a better balance between precision and recall. Meanwhile, the model size is only 69.5 MB, and the real-time detection speed reaches 32 frames per second.

Conclusion In this study, we propose an efficient, fully convolutional salient object detection model that improves the efficiency of feature decoding and enhances generalization ability through the weak attention mechanism and deep supervision training. Compared with existing methods, the results of the proposed method are more competitive and the detection speed is faster, even though the model remains small.
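For reference, two of the three evaluation criteria in the Result section can be computed as below. This is a minimal NumPy sketch following common SOD conventions; in particular, the adaptive threshold (twice the mean saliency) and the beta^2 = 0.3 weighting are assumptions from the wider literature, not values stated in this abstract.

```python
import numpy as np

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean absolute error between a saliency map and the binary
    ground truth, both scaled to [0, 1]."""
    return float(np.mean(np.abs(pred - gt)))

def f_measure(pred: np.ndarray, gt: np.ndarray, beta2: float = 0.3) -> float:
    """F-measure at an adaptive threshold (twice the mean saliency),
    with the beta^2 = 0.3 weighting common in the SOD literature."""
    thresh = min(2.0 * pred.mean(), 1.0)
    binary = pred >= thresh
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / ((gt > 0.5).sum() + 1e-8)
    return float((1 + beta2) * precision * recall
                 / (beta2 * precision + recall + 1e-8))
```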
Authors
项圣凯
曹铁勇
方正
洪施展
Xiang Shengkai; Cao Tieyong; Fang Zheng; Hong Shizhan (Institute of Command and Control Engineering, Army Engineering University, Nanjing 210001, China)
Source
《中国图象图形学报》
CSCD
Peking University Core Journals (北大核心)
2020, No. 1, pp. 136-147 (12 pages)
Journal of Image and Graphics
Funding
National Natural Science Foundation of China (61471394)
Excellent Young Scholars Fund of Jiangsu Province (BK20180080).
Keywords
salient object detection(SOD)
visual attention mechanism
encoder-decoder
fully convolutional networks(FCNs)
real-time detection