Abstract
To further improve the performance of cross-modal retrieval, a cross-modal hashing retrieval method that integrates multi-level similarity information is proposed. First, a self-attention mechanism is used to enhance the text features, and a new fusion feature is constructed from the original features and the hash features of the different modalities. Then, three auxiliary similarity matrices are constructed on top of these three kinds of features, and a fourth auxiliary similarity matrix is constructed as their weighted combination. Finally, the four matrices are used to compute loss functions both between different similarity matrices and between different modalities. Because the four matrices cover different feature forms as well as different construction methods, they express the similarity information of the different modalities more fully and thus improve retrieval performance. Experiments on three benchmark datasets, Wikipedia, MIRFlickr, and NUS-WIDE, show that the proposed method achieves higher mAP values at various code lengths than many state-of-the-art methods, demonstrating its effectiveness and robustness.
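The pipeline sketched in the abstract (three feature forms, three auxiliary similarity matrices, and a weighted fourth matrix entering the loss) can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under assumed choices: cosine similarity for the matrices, concatenation for the fusion feature, and arbitrary weights alpha, beta, gamma. None of these names or values come from the paper itself.

```python
# Minimal sketch of multi-level similarity construction.
# The cosine-similarity choice, the fusion-by-concatenation step, and
# the weights are illustrative assumptions, not the authors' formulation.
import torch
import torch.nn.functional as F

def similarity_matrix(x):
    """Pairwise cosine similarity for a batch of feature vectors."""
    x = F.normalize(x, dim=1)   # L2-normalize each row
    return x @ x.t()            # (batch, batch) similarity matrix

batch, dim = 32, 128
orig = torch.randn(batch, dim)       # original modality features
hash_feat = torch.randn(batch, dim)  # continuous hash-layer outputs
fused = torch.cat([orig, hash_feat], dim=1)  # fusion feature

# Three auxiliary similarity matrices, one per feature form.
S1 = similarity_matrix(orig)
S2 = similarity_matrix(hash_feat)
S3 = similarity_matrix(fused)

# Fourth matrix: a weighted combination of the first three.
alpha, beta, gamma = 0.4, 0.3, 0.3   # assumed weights
S4 = alpha * S1 + beta * S2 + gamma * S3

# One possible inter-matrix loss (MSE alignment); the paper also
# computes losses between modalities, omitted here for brevity.
loss = F.mse_loss(S2, S4)
print(loss.item())
```

In this reading, the combined matrix S4 lets the stabler original-feature similarity regularize the hash-feature similarity during training; the paper's actual loss design across matrices and modalities may differ.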
Authors
Li Zhixin; Hou Chuanwen; Xie Xiumin (Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin 541004)
Source
Journal of Computer-Aided Design & Computer Graphics
EI
CSCD
Peking University Core Journal (北大核心)
2022, No. 6, pp. 933-945 (13 pages)
Funding
National Natural Science Foundation of China (61966004, 61866004)
Natural Science Foundation of Guangxi (2019GXNSFDA245018).
Keywords
cross-modal retrieval
multiple similarity matrices
unsupervised learning
convolutional neural network
self-attention mechanism