Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm...Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64 D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients(BING), can be used for efficient objectness estimation, which requires only a few atomic operations(e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING's localization performance, when used in multithresholding straddling expansion(MTSE) postprocessing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and intersectionover-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 second per image.展开更多
基金supported by the National Natural Science Foundation of China (Nos. 61572264, 61620106008)
文摘Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64 D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients(BING), can be used for efficient objectness estimation, which requires only a few atomic operations(e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING's localization performance, when used in multithresholding straddling expansion(MTSE) postprocessing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and intersectionover-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 second per image.