
A Cross-Modal Face Retrieval Algorithm Based on Metric Learning
Abstract: Existing metric-learning-based cross-modal retrieval algorithms, when applied to cross-modal face retrieval, pay little attention to view (pose) differences and domain differences. They also fail to learn global information during metric learning and construct a large number of redundant triplets. To address these problems, this paper proposes a cross-modal common representation generation algorithm based on metric learning. A yaw-angle equivariant module compensates for yaw angle differences to obtain robust image features, and a multi-layer attention mechanism extracts discriminative (separable) video features. Global triplets and local triplets are used jointly to train the cross-modal common representation generation network, which improves the consistency and accuracy of metric learning, while semi-hard triplet screening accelerates the convergence of the loss function. A domain adaptation algorithm that combines domain calibration and transfer learning is also proposed to improve the generalization of the common representation. Experiments on three face video datasets, PB, YTC, and UMD, show that the algorithm effectively improves the accuracy of cross-modal face retrieval, and that fine-tuning the cross-modal common representation generation network with a small number of samples effectively improves cross-modal retrieval accuracy on target-domain images.
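The abstract mentions semi-hard triplet screening but does not state the exact criterion the authors use. As a rough illustration only, the sketch below applies the commonly used definition of a semi-hard triplet, in which the negative is farther from the anchor than the positive but still inside the margin: d(a, p) < d(a, n) < d(a, p) + margin. The function names, the Euclidean distance, and the margin value of 0.2 are all assumptions made for this example and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def semi_hard_triplets(embeddings, labels, margin=0.2):
    """Return (anchor, positive, negative) index triples that satisfy
    d(a, p) < d(a, n) < d(a, p) + margin.

    This is the common textbook definition of a semi-hard triplet;
    the paper's own screening rule may differ (assumption).
    """
    triplets = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            # A positive shares the anchor's label but is a different sample.
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = euclidean(embeddings[a], embeddings[p])
            for neg in range(n):
                # A negative has a different label from the anchor.
                if labels[neg] == labels[a]:
                    continue
                d_an = euclidean(embeddings[a], embeddings[neg])
                if d_ap < d_an < d_ap + margin:
                    triplets.append((a, p, neg))
    return triplets


def triplet_loss(embeddings, triplets, margin=0.2):
    """Mean hinge triplet loss max(0, d(a, p) - d(a, n) + margin)."""
    if not triplets:
        return 0.0
    total = 0.0
    for a, p, neg in triplets:
        d_ap = euclidean(embeddings[a], embeddings[p])
        d_an = euclidean(embeddings[a], embeddings[neg])
        total += max(0.0, d_ap - d_an + margin)
    return total / len(triplets)
```

Under this definition, every selected triplet has a loss strictly between 0 and the margin, so it still contributes a gradient without being dominated by the hardest, possibly mislabeled, negatives. Easy triplets, whose negatives already lie beyond the margin, are skipped, which is one way of avoiding the redundant triplets the abstract refers to.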
Authors: WO Yan; LIANG Jiyun; HAN Guoqiang (School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China)
Source: Journal of South China University of Technology (Natural Science Edition), 2022, No. 6, pp. 1-9 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Natural Science Foundation of Guangdong Province (2021A1515012020); Guangzhou Science and Technology Program (202002030298).
Keywords: metric learning; cross-modal retrieval; attention mechanism; deep learning