
Visual oriented object detection via feature alignment and Gaussian parameterization

Cited by: 2
Abstract: Oriented object detection is an active research topic in computer vision, with wide applications in remote sensing, scene text detection, and related fields. Large aspect ratios, dense arrangement, and arbitrary orientation are the main challenges for object detection in this field. This paper presents a cascaded (refined) oriented detector, R3DetGauss, built on a single-stage detection method, which uses coarse-to-fine progressive regression to locate objects quickly and accurately. To address the feature misalignment that arises in cascaded detectors, the paper designs a feature refinement module (FRM) that yields more accurate features and thereby improves detection performance. Specifically, FRM re-encodes the position information of the currently refined bounding box to the corresponding feature points through pixel-wise feature interpolation, thereby achieving feature reconstruction and alignment. The paper also uses a scale-invariant normalized Gaussian Wasserstein distance as the regression loss to further improve the quality of the predicted bounding boxes. In addition, it proposes an aspect-ratio-aware adaptive sampling strategy based on this distance, which improves the quality of sample assignment. Extensive quantitative and qualitative experiments on multiple public image datasets show that R3DetGauss improves on existing baselines and achieves state-of-the-art detection accuracy on a variety of datasets. The models and code are implemented and released in the domestic open-source deep learning framework Jittor, as well as in PyTorch and TensorFlow.
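The abstract does not give the formulas behind the Gaussian Wasserstein distance, so the following is only a minimal sketch of the usual construction, not the authors' implementation. It assumes an oriented box (cx, cy, w, h, theta) is modelled as a 2-D Gaussian with mean (cx, cy) and covariance R diag(w²/4, h²/4) Rᵀ, and it computes the squared 2-Wasserstein distance between two such Gaussians with the closed form that holds for 2×2 covariance matrices. The function names `box_to_gaussian` and `gwd_squared` are hypothetical, and the normalization step mentioned in the abstract is left out because its exact form is not stated here.

```python
import numpy as np


def box_to_gaussian(cx, cy, w, h, theta):
    """Map an oriented box (centre, width, height, angle in radians)
    to a 2-D Gaussian (mean, covariance). Assumed modelling, see lead-in."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    scale = np.diag([w * w / 4.0, h * h / 4.0])
    return np.array([cx, cy], dtype=float), rot @ scale @ rot.T


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes.

    For 2x2 symmetric positive-definite matrices,
    Tr((A^{1/2} B A^{1/2})^{1/2}) = sqrt(Tr(AB) + 2 * sqrt(det(A) * det(B))),
    which avoids an explicit matrix square root.
    """
    mu_a, sig_a = box_to_gaussian(*box_a)
    mu_b, sig_b = box_to_gaussian(*box_b)

    # Squared Euclidean distance between the box centres.
    center_term = float(np.sum((mu_a - mu_b) ** 2))

    # Shape/orientation term, clamped at zero against rounding error.
    det_prod = np.linalg.det(sig_a) * np.linalg.det(sig_b)
    cross = np.trace(sig_a @ sig_b) + 2.0 * np.sqrt(max(det_prod, 0.0))
    shape_term = np.trace(sig_a) + np.trace(sig_b) - 2.0 * np.sqrt(max(cross, 0.0))

    return center_term + max(float(shape_term), 0.0)
```

Under this modelling, two boxes with the same shape and angle differ only by the squared distance between their centres, and a square box maps to the same Gaussian at every angle, which is one reason distribution distances are attractive for arbitrary-orientation regression.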
Authors: Xue YANG; Junchi YAN (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200030, China)
Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 11, pp. 2250-2265 (16 pages)
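Funding: Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2020AAA0107600), the National Natural Science Foundation of China Excellent Young Scientists Fund (Grant No. 62222607), and the Shanghai Municipal Science and Technology Major Project (Grant No. 2021SHZDZX0102).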
Keywords: oriented object detection; computer vision; feature refinement module; distribution distance; label assignment; regression loss