Abstract
Web images exhibit a strong correlation between their associated text and visual content. Traditional content-based image retrieval methods tend to ignore this text–image correlation, while in cross-modal retrieval the low-level features of text and image are extracted independently, so the semantic association between the two modalities is not exploited effectively. To address this, a cross-modal semantic-enhanced image retrieval method (CSR) is proposed. A linear discriminant analysis (LDA) term on the low-level text features and a canonical correlation analysis (CCA) term between the two modalities are optimized under a joint constraint, so that the text semantics are enhanced while their strong discriminative power is transferred to the image features. The semantic features of text and image are then obtained through multinomial logistic regression, and the image semantic features are regularized with the text semantic features to further improve the semantic discriminability of the image features. Experiments on the Wikipedia and Pascal Sentence datasets show that the proposed method effectively improves the average precision of image retrieval.
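To make the pipeline described above concrete, the following is a minimal sketch, not the authors' CSR implementation: scikit-learn's LinearDiscriminantAnalysis, CCA, and LogisticRegression stand in for the jointly optimized LDA + CCA objective and the multinomial regression step, the paired feature matrices X_img / X_txt with labels y are synthetic placeholders, and a simple convex combination with a hypothetical weight alpha stands in for the text-to-image semantic regularization term.

```python
# Minimal sketch of a CSR-style cross-modal pipeline (illustrative only).
# Assumptions: paired image/text feature matrices with shared class labels;
# off-the-shelf LDA, CCA and multinomial logistic regression approximate the
# paper's joint optimization and regularization steps.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d_img, d_txt, n_classes = 300, 128, 64, 10          # toy sizes (hypothetical)
y = rng.integers(0, n_classes, size=n)
X_img = rng.normal(size=(n, d_img)) + y[:, None] * 0.05  # weakly class-correlated images
X_txt = rng.normal(size=(n, d_txt)) + y[:, None] * 0.20  # text is more discriminative

# Step 1: enhance the discriminability of the text features with LDA.
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1)
T_txt = lda.fit_transform(X_txt, y)

# Step 2: CCA couples the image features with the enhanced text features,
# transferring part of the text's semantic structure into the image projection.
cca = CCA(n_components=min(T_txt.shape[1], 8))
Z_img, Z_txt = cca.fit_transform(X_img, T_txt)

# Step 3: multinomial (softmax) logistic regression maps each modality's
# projected features to class-probability vectors, i.e. "semantic features".
clf_img = LogisticRegression(max_iter=1000).fit(Z_img, y)
clf_txt = LogisticRegression(max_iter=1000).fit(Z_txt, y)
S_img = clf_img.predict_proba(Z_img)
S_txt = clf_txt.predict_proba(Z_txt)

# Step 4: regularize the image semantic features with the text semantic
# features (a convex blend here; the paper uses a regularization term).
alpha = 0.5                                              # hypothetical trade-off weight
S_img_reg = (1 - alpha) * S_img + alpha * S_txt

# Retrieval would then rank database images by the similarity of S_img_reg
# to the query's semantic feature vector.
print(S_img_reg.shape)                                   # (n, n_classes)
```

In the paper the LDA and CCA terms are optimized jointly and the text-to-image regularization is part of the learning objective; the sequential, post-hoc version above only illustrates the flow of information from text semantics into the image representation.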
Authors
WANG Qi; WANG Rui; WANG Li
(School of Information Engineering, Nanyang Institute of Technology, Nanyang 473004, China; Lucky Huaguang Graphics Co., Ltd., Nanyang 473004, China; School of Civil Engineering, Nanyang Institute of Technology, Nanyang 473004, China)
Source
Journal of Nanyang Institute of Technology (《南阳理工学院学报》)
2021, No. 2, pp. 53-58 (6 pages)
Funding
Key Scientific Research Project of Higher Education Institutions of Henan Province (19B520017, 20A520030)
Science and Technology Research Project of the Henan Provincial Department of Science and Technology (202102310199)
Keywords
cross-modal
collaborative constraint
semantic enhancement
image retrieval