
Lightweight Face Recognition Algorithm Combining Transformer and CNN
Abstract: With the development of deep learning, convolutional neural networks (CNNs), which stack convolutional layers to gradually enlarge the receptive field and fuse local features, have become the mainstream approach to face recognition (FR). This approach has two drawbacks. First, it neglects the global semantic information of faces and pays too little attention to key facial features, which limits recognition accuracy. Second, stacking many layers with large parameter counts makes the network difficult to deploy on resource-constrained devices. To address these problems, an extremely lightweight FR algorithm that combines Transformer and CNN, called gcsamTfaceNet, is proposed. Depthwise separable convolutions are used to build the backbone network, which reduces the parameter count. A channel-spatial attention mechanism is introduced to select features across both the channel and spatial domains, which increases the attention paid to key facial regions. On this basis, a Transformer module is integrated to capture the global semantic information of the feature maps, overcoming the limitations of CNNs in modeling long-range semantic dependencies and improving the algorithm's global feature perception. With only 6.5×10^5 parameters, gcsamTfaceNet is evaluated on nine validation sets (LFW, CA-LFW, CP-LFW, CFP-FP, CFP-FF, AgeDB-30, VGG2-FP, IJB-B, and IJB-C), where it achieves average accuracies of 99.67%, 95.60%, 89.32%, 93.67%, 99.65%, 96.35%, 93.36%, 89.43%, and 91.38%, respectively, a good trade-off between parameter count and performance.
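The parameter saving that the abstract attributes to depthwise separable convolution can be illustrated with a short weight count. This is a minimal sketch and not the authors' architecture: the function names and the layer sizes (64 input channels, 128 output channels, a 3×3 kernel) are assumptions chosen only for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {dws}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, since the ratio reduces to 1/c_out + 1/k².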
Authors: LI Ming; DANG Qingxia (Engineering Research Center of Hubei Province for Clothing Information, Wuhan Textile University, Wuhan 430200, China; Hubei Key Laboratory of Digital Textile Equipment, Wuhan Textile University, Wuhan 430200, China)
Source: Computer Engineering and Applications (indexed in CSCD and the Peking University Core Journals list), 2024, Issue 14, pp. 96-104 (9 pages)
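Funding: Open Fund of the Hubei Key Laboratory of Digital Textile Equipment (DTL2018021); Open Fund of the Engineering Research Center of Hubei Province for Clothing Information (184084004).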
Keywords: lightweight face recognition; convolutional neural network; Transformer; attention mechanism