A Cross-Modal Face Retrieval Algorithm Based on Metric Learning (基于度量学习的跨模态人脸检索算法)

Abstract: Existing metric-learning-based cross-modal retrieval algorithms, when applied to cross-modal face retrieval, pay little attention to viewpoint (pose) differences and domain differences. They also fail to learn global information during metric learning and construct a large number of redundant triplets. To address these problems, this paper proposes a metric-learning-based algorithm for generating cross-modal common representations. A yaw-angle equivariant module compensates for yaw-angle differences to obtain robust image features, and a multi-layer attention mechanism extracts separable (discriminative) video features. Global triplets and local triplets are used jointly to train the cross-modal common representation generation network, which improves the consistency and accuracy of metric learning, while semi-hard triplet screening accelerates the convergence of the loss function. A domain adaptation algorithm that combines domain calibration and transfer learning is also proposed to improve the generalization of the common representations. Experiments on three face video datasets, PB, YTC, and UMD, show that the algorithm improves the accuracy of cross-modal face retrieval, and that fine-tuning the cross-modal common representation generation network with a small number of samples improves the accuracy of cross-modal retrieval on target-domain images.
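The abstract does not spell out how semi-hard triplets are screened. The sketch below is an illustration only, not the authors' implementation. It assumes the commonly used definition, in which a negative is semi-hard when it lies farther from the anchor than the positive but still inside the margin, i.e. d(a, p) < d(a, n) < d(a, p) + margin. Euclidean distance, the margin value, and the function names `semi_hard_triplets` and `triplet_loss` are all assumptions made for this example. The paper's global and local triplets are not modeled here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def semi_hard_triplets(embeddings, labels, margin):
    """Return (anchor, positive, negative) index triplets that are semi-hard.

    Assumed criterion (not taken from the paper):
        d(a, p) < d(a, n) < d(a, p) + margin
    """
    triplets = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            # A positive shares the anchor's identity but is a different sample.
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = euclidean(embeddings[a], embeddings[p])
            for k in range(n):
                # A negative has a different identity from the anchor.
                if labels[k] == labels[a]:
                    continue
                d_an = euclidean(embeddings[a], embeddings[k])
                if d_ap < d_an < d_ap + margin:
                    triplets.append((a, p, k))
    return triplets


def triplet_loss(embeddings, triplets, margin):
    """Mean hinge triplet loss over the given index triplets."""
    if not triplets:
        return 0.0
    total = 0.0
    for a, p, k in triplets:
        d_ap = euclidean(embeddings[a], embeddings[p])
        d_an = euclidean(embeddings[a], embeddings[k])
        total += max(0.0, d_ap - d_an + margin)
    return total / len(triplets)


# Toy data: four 2-D embeddings, two identities (hypothetical values).
embeddings = [(0.0, 0.0), (3.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
labels = [0, 0, 1, 1]

selected = semi_hard_triplets(embeddings, labels, margin=2.0)
loss = triplet_loss(embeddings, selected, margin=2.0)
print(selected, loss)
```

In this toy setup only two of the candidate triplets survive the screening. Easy triplets (negative already beyond the margin) contribute zero loss and are the redundant ones, and the hardest triplets (negative closer than the positive) are also skipped under this criterion.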

Authors: WO Yan (沃焱); LIANG Jiyun (梁籍云); HAN Guoqiang (韩国强) (School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China)

Affiliation: School of Computer Science and Engineering, South China University of Technology

Source: Journal of South China University of Technology (Natural Science Edition) (《华南理工大学学报（自然科学版）》), 2022, No. 6, pp. 1-9 (9 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.

Funding: Natural Science Foundation of Guangdong Province (2021A1515012020); Guangzhou Science and Technology Program (202002030298).

Keywords: metric learning; cross-modal retrieval; attention mechanism; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

