Object detection is widely used in object tracking; anchor-free object tracking provides an end-to-end single-object-tracking approach. In this study, we propose a new anchor-free network, the Siamese center-prediction network (SiamCPN). Given the referenced object features from the initial frame, we directly predict the center point and size of the object in subsequent frames with a Siamese-structure network, without the need for per-frame post-processing operations. Unlike other anchor-free tracking approaches, which are based on semantic segmentation and achieve anchor-free tracking by pixel-level prediction, SiamCPN directly obtains all the information required for tracking, greatly simplifying the model. A center-prediction sub-network is applied to multiple stages of the backbone to adaptively learn from the experience of different branches of the Siamese network. The model can accurately predict the object location, apply appropriate corrections, and regress the size of the target bounding box. Compared with other leading Siamese networks, SiamCPN is simpler, faster, and more efficient because it uses fewer hyperparameters. Experiments demonstrate that our method outperforms other leading Siamese networks on the GOT-10K and UAV123 benchmarks, and is comparable to other excellent trackers on LaSOT, VOT2016, and OTB-100 while improving inference speed by a factor of 1.5 to 2.
Funding: Supported by the National Key R&D Program of China (Grant No. 2018YFC0807500) and the National Natural Science Foundation of China (Grant Nos. U20B2070 and 61832016).
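The abstract describes predicting the target's center point, applying a correction, and regressing the bounding-box size. The sketch below illustrates how such outputs are commonly decoded in center-based detectors; it is not the authors' implementation. The function name `decode_center_prediction`, the three output maps (`heatmap`, `offset`, `size`), their shapes, and the `stride` value are all assumptions made for illustration.

```python
import numpy as np


def decode_center_prediction(heatmap, offset, size, stride=8):
    """Decode one bounding box from center-style prediction maps.

    Hypothetical layout (an assumption, not taken from the paper):
      heatmap: (H, W) center confidence scores
      offset:  (2, H, W) sub-cell correction (dx, dy) for the center
      size:    (2, H, W) predicted box (width, height) in pixels
      stride:  downsampling factor between the search image and the maps
    Returns (cx, cy, w, h, score) in search-image pixel coordinates.
    """
    # The most confident cell is taken as the coarse center location.
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)

    # Refine the coarse cell with the predicted correction, then map
    # back to image coordinates using the stride.
    dx, dy = offset[:, row, col]
    cx = (col + dx) * stride
    cy = (row + dy) * stride

    # Box size is read at the same cell.
    w, h = size[:, row, col]
    return float(cx), float(cy), float(w), float(h), float(heatmap[row, col])


# Toy example with a single confident cell at row 2, column 3.
heatmap = np.zeros((5, 5))
heatmap[2, 3] = 0.9
offset = np.zeros((2, 5, 5))
offset[:, 2, 3] = (0.5, 0.25)
size = np.zeros((2, 5, 5))
size[:, 2, 3] = (40.0, 20.0)

box = decode_center_prediction(heatmap, offset, size, stride=8)
```

Because the box is read from a single peak, no anchor matching or per-frame candidate filtering is needed in this sketch, which is the kind of simplification the abstract attributes to a center-prediction design.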