Abstract
Deep learning, driven by large-scale data, has achieved great success in many tasks. However, most current methods depend on fully annotated data, or assume that each image contains a single object located roughly at its center with little background. Real-world scenes have complex backgrounds and contain multiple, diverse objects, which makes classification harder, and full annotation is expensive. This paper addresses classification in the weakly supervised setting and proposes a deep model that combines an attention mechanism with a recurrent neural network. The model is trained end-to-end for multi-label learning using only image-level labels. The attended regions are adjusted automatically as the loss is minimized by stochastic gradient descent, so that the model focuses on a different local region of the image at each step. The effectiveness of the algorithm is verified on the PASCAL VOC 2007 and 2012 datasets, and the results show that the model is more interpretable than other methods.
Authors
张文
谭晓阳
Zhang Wen; Tan Xiaoyang (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China)
Source
《数据采集与处理》
CSCD
Peking University Core Journal
2018, Issue 5, pp. 801-808 (8 pages)
Journal of Data Acquisition and Processing
Funding
Supported by the Fundamental Research Funds for the Central Universities (NP2017108)
Keywords
weakly supervised
multi-label
attention
deep learning