Abstract
Scene recognition plays a key role in tasks such as visual information retrieval, image segmentation, and image/video understanding. With the development of deep learning, convolutional neural networks (CNNs) have greatly improved scene recognition by identifying discriminative objects in images. To enable autonomous scene recognition for home service robots such as intelligent wheelchair beds, this paper addresses the low recognition rate that arises on mobile or embedded devices, where computing resources and memory are limited and the discriminative objects captured by the network lack diversity. An indoor scene recognition method based on multi-scale feature extraction and an attention module is proposed. The method builds on the lightweight MobileNetV2 network and takes features at different scales from different branches of the network. To focus on the more discriminative features in a scene, the MRLA-Light attention module is added to the branches. Simulation results show a clear improvement in accuracy: the method reaches 86.3% on the MIT Indoor 67 dataset and 94.3% on the Scene 15 dataset, higher than networks of the same type.
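The multi-branch idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the branch shapes, the helper names (`global_avg_pool`, `channel_gate`, `fuse_multiscale`, `random_fmap`), and the random feature maps are all hypothetical, and a parameter-free sigmoid self-gate stands in for MRLA-Light, whose internals the abstract does not describe. The sketch only shows how feature maps of different spatial sizes might be pooled, reweighted per channel, and concatenated into one descriptor for a classifier.

```python
import math
import random


def global_avg_pool(fmap):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def channel_gate(vec):
    """Stand-in for an attention module (NOT MRLA-Light): scale each channel
    by the sigmoid of its own pooled response."""
    return [v * (1.0 / (1.0 + math.exp(-v))) for v in vec]


def fuse_multiscale(fmaps):
    """Pool each branch, gate its channels, and concatenate all branches."""
    fused = []
    for fmap in fmaps:
        fused.extend(channel_gate(global_avg_pool(fmap)))
    return fused


def random_fmap(channels, size, rng):
    """Hypothetical feature map standing in for one backbone branch output."""
    return [[[rng.random() for _ in range(size)] for _ in range(size)]
            for _ in range(channels)]


rng = random.Random(0)
# Three assumed branches: shallow (high resolution) to deep (low resolution).
branches = [random_fmap(4, 8, rng), random_fmap(6, 4, rng), random_fmap(8, 2, rng)]
descriptor = fuse_multiscale(branches)
print(len(descriptor))  # 4 + 6 + 8 = 18 channels in the fused descriptor
```

In a real system the branches would be intermediate outputs of the MobileNetV2 backbone and the gate would be a learned attention module; the fused descriptor would then feed a classification layer.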
Authors
YUE Youjun (岳有军)
ZHANG Yuankun (张远锟)
ZHAO Hui (赵辉)
WANG Hongjun (王红君)
Affiliations: School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China; Tianjin Key Laboratory for Control Theory & Applications in Complicated Systems, Tianjin 300384, China
Source
Computer and Modernization (《计算机与现代化》)
2024, No. 8, pp. 37-42 (6 pages)
Funding
Tianjin Science and Technology Support Program (19YFZCSN00360).
Keywords
indoor scene recognition
lightweight network
attention module
feature extraction