Multiple object tracking (MOT) in unmanned aerial vehicle (UAV) videos has attracted attention. Because of the UAV's observation perspective, objects are relatively small and their scale changes dramatically. Besides, most MOT algorithms for UAV videos cannot run in real time because they follow the tracking-by-detection paradigm. We propose a feature-aligned attention network (FAANet). It mainly consists of a channel and spatial attention module and a feature-aligned aggregation module. We also improve real-time performance by using the joint-detection-embedding paradigm and a structural re-parameterization technique. We validate its effectiveness with extensive experiments on the UAV detection and tracking benchmark, achieving a new state of the art of 44.0 MOTA and 64.6 IDF1 at 38.24 frames per second on a single 1080Ti graphics processing unit.
Funding: This work was supported by the National Program on Key Basic Research Project (No. 2014CB744903), the National Natural Science Foundation of China (Nos. 61673270 and 61973212), and the Key Technology Research Program of Sichuan Provincial Department of Science and Technology (No. 2020YFSY0027).
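The abstract reports results as MOTA and IDF1 scores. For readers unfamiliar with these metrics, the following is a minimal sketch of their standard definitions, MOTA = 1 - (FN + FP + IDSW) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names, argument names, and counts below are illustrative assumptions; they are not taken from the paper or from its evaluation code.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple object tracking accuracy.

    All counts are summed over every frame of the sequence;
    num_gt is the total number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1: harmonic mean of identity precision and recall."""
    denom = 2 * idtp + idfp + idfn
    # Return 0.0 when there is nothing to match, to avoid dividing by zero.
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not the paper's data.
    print(f"MOTA = {100 * mota(300, 100, 20, 1000):.1f}")
    print(f"IDF1 = {100 * idf1(600, 300, 400):.1f}")
```

Both scores are conventionally reported as percentages, which is how the 44.0 and 64.6 figures above should be read. MOTA can be negative when the number of errors exceeds the number of ground-truth objects.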