Abstract
To address face detection under occlusion, such as mask wearing, a multi-scale attention-driven Faster R-CNN (MSAF R-CNN) face detection model is proposed. First, to fully exploit the multi-scale information of face targets, the Res2Net grouped-residual structure is introduced into the original Faster R-CNN framework to extract finer-grained feature representations. Second, a spatial-channel attention Res2Net (SCA-Res2Net) module is developed, which uses the attention mechanism to adaptively learn target features at different scales. Finally, to capture the global feature representation of targets and mitigate overfitting, a weighted spatial pyramid pooling network is embedded at the top of the model, partitioning feature maps from coarser to finer scales. Experimental results on the AIZOO and FMDD face datasets show that the proposed MSAF R-CNN model achieves masked face detection accuracies of 90.37% and 90.11%, respectively, verifying its feasibility and effectiveness.
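The grouped-residual idea behind Res2Net mentioned in the abstract can be illustrated with a minimal sketch: channels are split into groups, and each group receives the output of the previous group before its own transform, enlarging the receptive field hierarchically. The real module applies 3x3 convolutions to feature maps; here a placeholder `transform` on plain lists shows only the data flow, and all names (`scale`, `transform`) are illustrative assumptions, not from the paper.

```python
def res2net_split(x, scale=4, transform=lambda g: [v * 0.5 for v in g]):
    """Split the channel list into `scale` groups and process them
    hierarchically: group i is summed with the output of group i-1
    before its own transform (a stand-in for a 3x3 convolution),
    then all group outputs are concatenated back together."""
    n = len(x) // scale
    groups = [x[i * n:(i + 1) * n] for i in range(scale)]
    outputs = [groups[0]]          # first group passes through unchanged
    prev = groups[0]
    for g in groups[1:]:
        merged = [a + b for a, b in zip(g, prev)]  # add previous output
        prev = transform(merged)                   # placeholder transform
        outputs.append(prev)
    return [v for group in outputs for v in group]  # concatenate groups

out = res2net_split([1.0] * 8, scale=4)
```

Because later groups aggregate the outputs of earlier ones, a single block mixes information across several effective scales, which is the property the MSAF R-CNN model exploits for multi-scale face features.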
Authors
LI Zechen, LI Hengchao, HU Wenshuai, YANG Jinyu, HUA Zexi
(School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China)
Source
Journal of Southwest Jiaotong University (《西南交通大学学报》)
Indexed in: EI, CSCD, Peking University Core Journals
2021, Issue 5, pp. 1002-1010 (9 pages)
Funding
National Natural Science Foundation of China (61871335)
Fundamental Research Funds for the Central Universities (2682020XG02, 2682020ZT35)
National Key R&D Program of China (2020YFB1711902)
Keywords
masked face
deep learning
attention mechanism
multi-scale learning
feature fusion
object detection