Abstract
High-precision neural network models for facial landmark localization are constrained by computing resources such as hardware processing power and storage space, so they cannot be deployed on many embedded devices and mobile terminals. To reduce the computational scale of the network model, this paper proposes a lightweight facial landmark localization algorithm based on depthwise separable convolution. Building on bidirectional pyramid feature fusion, the algorithm adds cross-level feature fusion paths and fuses the cross-level features with weights, so as to make full use of the limited features extracted by the backbone network. The model is only 10.1 MB in size; on a single NVIDIA RTX 2070 SUPER GPU, inference takes 0.147 s and 806.61M floating-point operations per image. With 3.84M parameters, the model achieves a normalized mean error of 5.08% and a failure rate of 0.12% on the 300-W public test set. The experimental results show that, compared with traditional methods, the algorithm greatly reduces the computational scale and can be ported to embedded devices for facial landmark detection.
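The two ideas named in the abstract can be illustrated with a minimal sketch. The first part compares weight counts for a standard convolution and a depthwise separable one, which is where the reduction in model size comes from. The second part shows one possible form of weighted feature fusion. The layer sizes, the function names, and the normalization scheme (clamp the weights at zero and divide by their sum) are assumptions made for illustration; the abstract does not specify the paper's exact fusion rule.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Assumption: weights are clamped at zero and divided by their sum.
    The scheme used in the paper may differ.
    """
    clamped = [max(w, 0.0) for w in weights]
    total = sum(clamped) + eps
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clamped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    print(conv_params(64, 128, 3), separable_conv_params(64, 128, 3))

    # Three hypothetical pyramid levels, already resized to a common shape.
    levels = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(weighted_fusion(levels, [1.0, 1.0, 2.0]))
```

For the hypothetical 64-to-128-channel 3 x 3 layer, the standard convolution has 73,728 weights and the separable version has 8,768, roughly an 8.4-fold reduction.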
Authors
Li Yangyang (李洋洋)
Jiang Congshi (江聪世)
School of Remote Sensing & Information Engineering, Wuhan University, Wuhan 430079, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal (北大核心)
2021, Issue 6, pp. 1874-1878 (5 pages)
Keywords
feature pyramid (spatial pyramid network)
facial landmark
feature fusion
lightweight algorithm