Funding: the National Science Fund for Distinguished Young Scholars of China (No. 51625501) and the Aeronautical Science Foundation of China (No. 201946051002).
Abstract: Small-object detection has long been a challenge. In industry, high-megapixel cameras are used to address this problem. However, current detectors are inefficient on high-resolution images. In this work, we propose a new module called Pre-Locate Net, a plug-and-play structure that can be combined with most popular detectors. We use classification ideas to obtain candidate regions in images, which greatly reduces the amount of computation and thus enables rapid detection in high-resolution images. Pre-Locate Net consists mainly of two parts: candidate region classification and behavior classification. Candidate region classification is used to obtain a candidate region, and behavior classification is used to estimate the scale of an object. Different follow-up processing is adopted for different scales to balance the variance of the network input. Unlike popular candidate region generation methods, we abandon bounding-box regression and adopt classification instead, so that a candidate region can be predicted in the shallow network. We build a high-resolution dataset of aircraft and landing gear covering complex scenes to verify the effectiveness of our method. Compared with state-of-the-art detectors (e.g., Guided Anchoring, Libra-RCNN, and FASF), our method achieves the best mAP of 94.5 on 1920×1080 images at 16.7 FPS.
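The idea of obtaining a candidate region by classification rather than bounding-box regression can be illustrated with a small sketch. This is not the authors' implementation: the grid layout, the 0.5 threshold, the per-cell scores, and the names `candidate_region` and `pixel_fraction` are all assumptions made for illustration. The sketch supposes that some shallow classifier has already scored each cell of a coarse grid, and shows how those scores could be turned into a crop and how much of the frame a downstream detector would then need to process.

```python
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


def candidate_region(scores: List[List[float]], cell_w: int, cell_h: int,
                     threshold: float = 0.5) -> Optional[Box]:
    """Return the pixel box enclosing every grid cell scored above threshold.

    `scores` stands in for the per-cell output of a hypothetical shallow
    classifier; no box coordinates are regressed.
    """
    hits = [(r, c)
            for r, row in enumerate(scores)
            for c, s in enumerate(row)
            if s > threshold]
    if not hits:
        return None  # no cell was classified as containing an object
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(cols) * cell_w, min(rows) * cell_h,
            (max(cols) + 1) * cell_w, (max(rows) + 1) * cell_h)


def pixel_fraction(box: Box, image_w: int, image_h: int) -> float:
    """Fraction of the full frame covered by the candidate region."""
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0) / (image_w * image_h)


# Assumed example: a 1920x1080 frame split into a 4x3 grid of 480x360 cells.
scores = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.1],
    [0.0, 0.3, 0.2, 0.1],
]
box = candidate_region(scores, cell_w=480, cell_h=360)
fraction = pixel_fraction(box, 1920, 1080)
```

In this made-up example only the two central cells exceed the threshold, so the detector would receive a 960×360 crop, one sixth of the original pixels. How the scale estimate from behavior classification changes the follow-up processing is not specified in the abstract and is left out of the sketch.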