Abstract
This paper proposes a cross-age face recognition model that fuses CNN and Transformer architectures. The model first extracts rich facial features with a depthwise separable T2T-ViT network, then uses a multi-scale attention decomposition module to nonlinearly decouple age and identity features, and finally constrains the feature decomposition jointly through mutual information minimization, cross-entropy, and the ArcFace function. On three benchmark datasets, FG-NET, CACD_VS, and CALFW, the model achieves accuracies of 94.97%, 99.51%, and 95.81%, respectively, approaching or surpassing state-of-the-art (SOTA) performance. These results indicate that the proposed model extracts comprehensive facial information and decouples features efficiently, yielding relatively advanced recognition performance.
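The abstract names three constraints on the feature decomposition (ArcFace, cross-entropy, and mutual information minimization) but gives no formulas. The following is a minimal sketch of how such terms might be combined, assuming the standard ArcFace additive angular margin for the identity branch and a plain weighted sum of the three terms. The function names, the scale `s`, the margin `m`, and the weights `lambda_age` and `lambda_mi` are illustrative assumptions, not values taken from the paper, and the mutual information estimate is treated as an externally supplied number.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_logits(feature, class_weights, label, s=64.0, m=0.5):
    """Standard ArcFace logits: s*cos(theta + m) for the target class,
    s*cos(theta) for all others. s and m are assumed defaults, and the
    usual safeguard for theta + m > pi is omitted for brevity."""
    f = l2_normalize(feature)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(f, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos domain
        if j == label:
            cos_theta = math.cos(math.acos(cos_theta) + m)
        logits.append(s * cos_theta)
    return logits

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[label]

def total_loss(id_loss, age_loss, mi_estimate, lambda_age=1.0, lambda_mi=1.0):
    """Hypothetical weighted sum of the three constraints; the weighting
    scheme is an assumption, not the paper's stated objective."""
    return id_loss + lambda_age * age_loss + lambda_mi * mi_estimate

# Usage: identity loss for one 2-D feature and two identity classes.
weights = [[1.0, 0.0], [0.0, 1.0]]
id_loss = cross_entropy(arcface_logits([1.0, 1.0], weights, label=0), 0)
```

The margin makes the target class harder to satisfy than a plain softmax would, which is the usual motivation for ArcFace in identity supervision. How the paper estimates mutual information between the age and identity features is not stated in the abstract, so that term is left as an input here.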
Authors
LIU Ermao (刘二毛); ZHI Min (智敏) (College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China)
Source
Journal of Inner Mongolia Normal University (Natural Science Edition) (《内蒙古师范大学学报(自然科学汉文版)》)
CAS
2024, No. 1, pp. 53-60 (8 pages)
Funding
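Natural Science Foundation of Inner Mongolia Autonomous Region projects "Cross-Age Sheep Face Recognition Based on Orthogonal Video Transformer" (2023MS06009) and "Research on Human Behavior Recognition Based on Convolutional Neural Networks" (2018MS06008);
Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region "Research on Human Action Recognition Based on Human-Object Association" (NJZZ21004).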