Abstract
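Owing to the hardware constraints of security equipment, low-resolution face detection in video surveillance must balance detection accuracy, speed, and memory footprint. To address this problem, deformable convolution (DC) is combined with a Lambda layer, and a lightweight, scale-adaptive deep network for low-resolution face detection, DLFace, is proposed. First, drawing on the RetinaFace algorithm, an improved depthwise separable convolution is used to effectively prevent information loss during training. Second, an improved deformable convolution is introduced into the backbone network and the SSH (single stage headless) detection module, enlarging the receptive field so as to adapt to multi-factor variations in faces. Finally, a Lambda layer is introduced at the high level of the backbone network to effectively mine semantic and location information and form a richer feature representation. Experimental results on the WiderFace dataset show that DLFace achieves a balance between performance and speed. Its superiority is verified in different scenarios, indicating that DLFace is well suited to low-resolution face detection in video surveillance.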
For low-resolution face detection in real-world video surveillance, balancing speed, accuracy, and memory consumption is of great importance because of hardware constraints. To address this problem, and inspired by the recent RetinaFace, this paper proposes a lightweight, scale-adaptive deep face detection model, termed DLFace. First, an improved depthwise separable convolution effectively prevents information loss during training. Second, an improved deformable convolution is introduced into the backbone network and the single stage headless (SSH) face detector, enlarging the receptive field while adapting to facial changes such as expression and pose. Finally, a Lambda layer is introduced at the high level of the backbone network to explore semantic and location information and form a richer representation of facial features. Experimental results on the WiderFace dataset show that DLFace achieves performance comparable to or better than existing lightweight face detection methods. DLFace also achieves a better balance between prediction efficiency and effectiveness than most previous methods.
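To make the lightweight motivation concrete, the following minimal sketch compares the weight count of a standard convolution with that of a plain depthwise separable convolution. It assumes the textbook formulation (a k×k depthwise filter per input channel followed by a 1×1 pointwise convolution, biases ignored), not the improved variant used in DLFace, whose details are not given in this abstract. The function names and the channel sizes are hypothetical and chosen only for illustration.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a plain depthwise separable convolution (bias ignored).

    Assumption: one k x k filter per input channel (depthwise step),
    followed by a 1 x 1 convolution that mixes channels (pointwise step).
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {sep}, ratio: {std / sep:.2f}")
```

For these illustrative sizes the separable layer needs roughly one eighth of the weights, which is the kind of saving that makes such layers attractive on constrained surveillance hardware.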
Authors
胡洪明
邵文泽
李金叶
葛琦
邓海松
HU Hongming; SHAO Wenze; LI Jinye; GE Qi; DENG Haisong (College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; College of Statistics and Mathematics, Nanjing Audit University, Nanjing 211815, China)
Source
《数据采集与处理》
CSCD
Peking University Core Journals
2022, No. 5, pp. 1070-1083 (14 pages)
Journal of Data Acquisition and Processing
Funding
National Natural Science Foundation of China (61771250, 61972213, 11901299).
Keywords
face detection
deformable convolution
lightweight
multi-scale feature fusion