FastFace: a real-time robust algorithm for face detection

Cited by: 9


Abstract Objective Face detection is a crucial step in many problems involving verification, identification, and expression analysis. Although state-of-the-art convolutional neural network (CNN)-based face detectors have improved detection accuracy, they are computationally prohibitive and therefore unsuitable for CPU-only devices. Achieving high detection accuracy on CPUs while running in real time remains challenging. One reason is that most backbone networks in current face detection models are transferred from generic object detection networks; these models are large and carry redundant information when modeling human faces. Moreover, the large search space of possible face locations and the variation of face sizes within one image demand heavy computation for robust detection. To address fast and robust face detection under unconstrained conditions, this paper proposes a detection method based on a self-designed lightweight neural network. Method The underlying idea is to compress and accelerate deep networks without significantly degrading model performance. Efforts have been made to design compact networks, and prior results show that changing the way convolution is performed can save parameters in neural networks.
In this study, depth-wise separable convolution, which was first introduced in MobileNets, is used for feature extraction. We then combine the ideas of inception and residual connection to construct several feature extraction modules, which together constitute our backbone network. Unlike standard convolution, depth-wise separable convolution implements the convolution operation as a depth-wise convolution followed by a 1×1 point-wise convolution. When the kernel size is 3×3, depth-wise separable convolution requires 8 to 9 times less computation than standard convolution. Given that inception modules and residual connections have become essential in recent networks, we also introduce them into our feature extraction modules to enrich the receptive fields. In contrast to existing convolutional modules, we design our own bottleneck modules (with different strides), inception modules, and residual inception modules based on depth-wise separable convolution. The modules are then concatenated to form a complete network model. Inception modules, which are composed of bottleneck modules in parallel, aim to rapidly reduce the size of the input image. As the name suggests, residual inception modules are inception modules with residual connections; they decrease the sizes of feature maps and enrich the receptive fields. Detection is carried out on multiple feature layers to increase robustness to scale variations of faces in input images. A one-stage detection strategy is applied for fast face detection. We conduct detection at three different levels of feature maps in a single feed-forward pass; that is, we simultaneously classify and regress object areas on these feature maps by using convolutions.
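The "8 to 9 times" figure for 3×3 kernels can be checked by counting multiply-accumulate operations. The sketch below is an illustrative calculation rather than code from the paper; the function names and the example channel counts are assumptions made for the illustration.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Multiply-accumulates of a k x k standard convolution producing an h x w output."""
    return k * k * c_in * c_out * h * w


def separable_conv_cost(k, c_in, c_out, h, w):
    """Cost of a k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in * h * w      # one k x k filter per input channel
    pointwise = c_in * c_out * h * w      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


def cost_ratio(k, c_in, c_out, h=1, w=1):
    """How many times cheaper the separable form is than the standard form."""
    return standard_conv_cost(k, c_in, c_out, h, w) / separable_conv_cost(k, c_in, c_out, h, w)


if __name__ == "__main__":
    # Hypothetical layer widths, chosen only to illustrate the trend.
    for c_out in (64, 128, 256):
        print(c_out, round(cost_ratio(3, c_out, c_out), 2))
```

The ratio simplifies to k²·c_out / (k² + c_out), so it depends only on the kernel size and the number of output channels. For k = 3 it approaches 9 as the layer gets wider and exceeds 8 once the layer has at least 72 output channels, which is consistent with the range quoted in the abstract.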
When fine-tuning the exact locations of the object areas, we first set prior boxes, namely default anchors, on the corresponding feature layers, and then use a bounding box regression algorithm to adjust the location and size of the anchors so that they move closer to the ground truth boxes. To reduce the number of default anchors and save model parameters, we set the default anchors according to prior knowledge of the face box ratio. Result We build and train our detection model with the TensorFlow deep learning library. The model is trained on the WIDER FACE dataset with several data augmentation techniques. We test the model on the Face Detection Dataset and Benchmark (FDDB) and compare its mean average precision (mAP) and detection speed with those of several classical algorithms. The proposed method achieves real-time, high-precision detection on the CPU. Compared with typical deep learning methods such as multitask cascaded convolutional networks (MTCNN), our method raises the detection speed to 25 frames per second on CPUs while maintaining an mAP of 0.892, which is higher than that of most traditional methods. Conclusion Face detectors based on deep learning exhibit improved detection accuracy. However, their high computational complexity makes them very slow on CPUs. This paper presents a fast and robust face detection method based on a lightweight neural network. A simple and efficient convolutional neural network is built from depth-wise separable convolution, and the ideas of inception and residual connection are used to keep the model both lightweight and powerful. The default anchors are set according to the characteristics of face boxes within a one-stage detection strategy. Experiments demonstrate that the proposed method significantly reduces redundant operations in the detection process.
With a detection speed of 25 frames/s on CPUs, the method is robust: it performs well in terms of accuracy and remains fast with limited computing resources under unconstrained conditions.
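The default-anchor and bounding box regression step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the anchor shapes or the offset encoding, so the square (1:1) anchors, the strides and sizes, and the center-offset/log-scale parameterization commonly used in one-stage detectors are all assumptions, and the function names are hypothetical.

```python
import math


def make_square_anchors(feature_size, stride, sizes):
    """Place one square anchor per size at each cell centre of a feature map.

    Returns a list of (cx, cy, w, h) boxes in input-image pixels.
    Square anchors are an assumption motivated by the face box ratio.
    """
    anchors = []
    for row in range(feature_size):
        for col in range(feature_size):
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for s in sizes:
                anchors.append((cx, cy, float(s), float(s)))
    return anchors


def decode(anchor, offsets):
    """Apply predicted regression offsets (dx, dy, dw, dh) to one anchor.

    Assumed encoding: centre shifts are scaled by the anchor size,
    width and height are scaled exponentially.
    """
    acx, acy, aw, ah = anchor
    dx, dy, dw, dh = offsets
    cx = acx + dx * aw
    cy = acy + dy * ah
    w = aw * math.exp(dw)
    h = ah * math.exp(dh)
    return (cx, cy, w, h)


if __name__ == "__main__":
    # Hypothetical 2 x 2 feature map with stride 16 and a single anchor size.
    anchors = make_square_anchors(feature_size=2, stride=16, sizes=[32])
    print(decode(anchors[0], (0.5, 0.0, 0.0, 0.0)))
```

Using a single aspect ratio keeps the number of anchors per cell equal to the number of sizes, which is one way the anchor count, and with it the size of the prediction layers, could be reduced.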

Authors: Li Qiyun; Ji Qingge; Hong Saiding (School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China; Guangdong Province Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China)

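Affiliations: School of Data and Computer Science, Sun Yat-sen University; Guangdong Province Key Laboratory of Big Data Analysis and Processing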

Source: Journal of Image and Graphics (中国图象图形学报), CSCD, Peking University Core Journal, 2019, No. 10, pp. 1761-1771 (11 pages)

Funding: Natural Science Foundation of Guangdong Province (2016A030313288); Key-Area Research and Development Program of Guangdong Province (2018B010107005)

Keywords: computer vision; face detection; convolutional neural network (CNN); compact model; one-stage detection; default anchor

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]

