Abstract
Instance-level sketch-based image retrieval aims to retrieve images using sketches as queries. Sketches and real images are separated by a large modality gap, and their features are misaligned. Existing methods do not reduce this modality gap effectively, and they capture information at only a single granularity, so they cannot align features effectively. To address these issues, a two-stream multi-granularity local alignment network (TSMLA) is proposed. A two-stream feature extractor extracts both modality-shared and modality-specific local features, and both kinds of features are used to compute the distance between a sketch and a real image, which reduces the difference between modalities. In addition, a multi-granularity local alignment module pools the distance matrix at several granularities, aligning local features at different scales and further mitigating feature misalignment. TSMLA thereby makes full use of the information in sketches and real images while exploiting the relationships among features of different granularities. Experiments on multiple datasets validate the effectiveness of TSMLA.
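The idea of pooling a sketch-to-image distance matrix at several granularities can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance metric, the pooling operator, the kernel sizes, or how the scales and the two streams are combined. Euclidean distance, non-overlapping average pooling, kernel sizes (1, 2), row-wise minimum matching, and simple averaging or summing are all assumptions, and every function name below is hypothetical.

```python
import math


def pairwise_distances(sketch_feats, image_feats):
    """Euclidean distance between every sketch local feature and every image local feature."""
    return [[math.dist(s, p) for p in image_feats] for s in sketch_feats]


def avg_pool_2d(matrix, k):
    """Non-overlapping k x k average pooling (assumed operator).

    Trailing rows/columns that do not fill a whole window are dropped.
    """
    rows, cols = len(matrix) // k, len(matrix[0]) // k
    pooled = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [matrix[r * k + i][c * k + j] for i in range(k) for j in range(k)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def alignment_distance(matrix):
    """Match each row (sketch region) to its closest column (image region), then average."""
    return sum(min(row) for row in matrix) / len(matrix)


def multi_granularity_distance(sketch_feats, image_feats, kernel_sizes=(1, 2)):
    """Average the alignment distance over several pooling granularities (assumed combination)."""
    dist = pairwise_distances(sketch_feats, image_feats)
    per_scale = [alignment_distance(avg_pool_2d(dist, k)) for k in kernel_sizes]
    return sum(per_scale) / len(per_scale)


def two_stream_distance(sk_shared, im_shared, sk_specific, im_specific):
    """Sum the distances from the modality-shared and modality-specific streams (assumed fusion)."""
    return (multi_granularity_distance(sk_shared, im_shared)
            + multi_granularity_distance(sk_specific, im_specific))


# Toy local features: four 2-D vectors per input (real features would come from a network).
sketch = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matching_image = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
other_image = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]]

d_match = two_stream_distance(sketch, matching_image, sketch, matching_image)
d_other = two_stream_distance(sketch, other_image, sketch, other_image)
```

In this toy setup the matching image receives a smaller distance than the unrelated one, which is the ranking behaviour a retrieval system needs. With a kernel size of 1 the pooled matrix equals the original distance matrix, so the finest granularity compares individual local features, while larger kernels compare groups of neighbouring features.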
Authors
韩雪昆
苗夺谦
张红云
张齐贤
HAN Xuekun; MIAO Duoqian; ZHANG Hongyun; ZHANG Qixian (College of Electronic and Information Engineering, Tongji University, Shanghai 201804; Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804)
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals
2023, No. 8, pp. 701-711 (11 pages)
Pattern Recognition and Artificial Intelligence
Funding
National Key Research and Development Program of China (No. 2022YFB3104700)
National Natural Science Foundation of China (No. 61976158, 61976160, 62076182)
Keywords
Sketch-Based Image Retrieval
Feature Extraction
Feature Fusion
Cross-Modal Retrieval