Abstract
Person search is a joint task that simultaneously performs pedestrian detection and person re-identification. However, the two sub-tasks conflict: pedestrian detection aims to separate persons from background regions and therefore focuses on what pedestrians have in common, whereas person re-identification aims to distinguish different persons and therefore focuses on what makes each pedestrian unique. To address this conflict, a simple and efficient spatial-level decoupling strategy based on the idea of spatial separation is proposed, in contrast to existing depth-level decoupling methods that stack multiple convolutional layers. The strategy designs a separate deformable convolution for each sub-task, adaptively extracting detection features and re-identification features at different positions and thereby separating pedestrian commonness from uniqueness. Furthermore, to exploit rich contextual information for better discrimination between persons, a context-enhanced feature extraction module is proposed. The module uses a globally aware multi-head attention network to generate multi-level features with complementary information, and then fuses them with a multi-level feature fusion module based on a self-attention mechanism to obtain context-enhanced features. The spatial-level decoupling strategy is then applied to the context-enhanced features, sampling them at different spatial positions to decouple pedestrian detection from person re-identification. Experimental results show that the proposed method achieves a mean average precision (mAP) of 94.2% and a top-1 accuracy of 94.6% on the CUHK-SYSU test set, and an mAP of 52.6% and a top-1 accuracy of 87.6% on the PRW test set, indicating that it effectively improves person search performance.
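The idea of sampling one shared feature map at different positions for the two sub-tasks can be illustrated with a small sketch. This is not the authors' implementation: it is a pure-Python toy, the function names `bilinear` and `deformable_sample` are hypothetical, and the offsets are set by hand, whereas in a deformable convolution they would be predicted by a learned layer. It only shows that two offset sets applied to the same feature map produce two different features, one per branch.

```python
import math

def bilinear(feat, y, x):
    """Bilinearly sample a 2-D map (list of rows) at fractional (y, x).
    Positions outside the map contribute zero."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * feat[yy][xx]
    return val

def deformable_sample(feat, cy, cx, offsets, weights):
    """Weighted sum over a 3x3 grid centred at (cy, cx), where each tap
    is shifted by its own (dy, dx) offset before sampling."""
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for (dy, dx), (oy, ox), w in zip(grid, offsets, weights):
        out += w * bilinear(feat, cy + dy + oy, cx + dx + ox)
    return out

# Toy shared feature map: value at (y, x) is 10*y + x.
feat = [[10.0 * y + x for x in range(5)] for y in range(5)]
weights = [1.0 / 9.0] * 9  # assumed averaging kernel, for illustration only

# Hand-set offsets (assumption): the detection branch keeps the regular
# grid, the re-identification branch shifts every tap by half a pixel.
det_offsets = [(0.0, 0.0)] * 9
reid_offsets = [(0.5, 0.5)] * 9

det_feat = deformable_sample(feat, 2, 2, det_offsets, weights)
reid_feat = deformable_sample(feat, 2, 2, reid_offsets, weights)
print(det_feat, reid_feat)
```

In a trained model each branch would learn its own offsets and kernel weights, so the detection branch and the re-identification branch could attend to different parts of the same person region.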
Authors
Pang Yanwei (庞彦伟)
Wang Jiabei (王佳蓓)
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain-Inspired Intelligence Technology, Tianjin University, Tianjin 300072, China
Source
Journal of Tianjin University: Science and Technology (《天津大学学报(自然科学与工程技术版)》)
EI
CAS
CSCD
PKU Core Journals (北大核心)
2023, Issue 12, pp. 1307-1316 (10 pages)
Funding
Supported by the Tianjin Science and Technology Plan Project (No. 19ZXZNGX00050).
Keywords
person search
pedestrian detection
person re-identification
deformable convolution
contextual enhancement