
Real-time Detection of Arbitrary-Shape Scene Text Based on Segmentation
Abstract: Scene text detection currently faces two main challenges: the trade-off between real-time performance and accuracy, and the detection of arbitrarily shaped text. Both determine whether scene text detection is practical in real-world applications. To address these two problems, this paper adopts a segmentation-based approach and proposes a lightweight backbone network with strong feature extraction ability that can detect natural scene text of arbitrary shape accurately and in real time. Specifically, a structurally simple dual-resolution residual backbone network is combined with a low-computational-cost deep aggregation pyramid pooling module; the features extracted by the two are fused and then segmented by a differentiable binarization module. Comparative experiments on the standard English dataset ICDAR2015 show that the proposed improvements are effective and that the method achieves comparable results in both real-time performance and accuracy.
Authors: XU Hong-kui, LI Zhen-ye, GUO Wen-tao, ZHAO Jing-zheng, GUO Xu-bin (School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China; Shandong Key Laboratory of Intelligent Buildings Technology, Jinan 250101, China)
Source: Computer and Modernization (《计算机与现代化》), 2023, No. 11, pp. 95-100 (6 pages)
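Funding: Major Science and Technology Innovation Project of Shandong Province (2019JZZY010120); Key Research and Development Program of Shandong Province (2019GSF111054).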
Keywords: real-time text detection; dual-resolution backbone; semantic segmentation; deep aggregation pyramid pooling module
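
The abstract names a differentiable binarization module as the final segmentation step but does not give its formula. As a minimal sketch, assuming the commonly used form B = 1 / (1 + exp(-k(P - T))), where P is a probability map, T is a learned threshold map, and k is an amplification factor, the step could look like the following. The function name, the plain-list inputs, and the default k = 50 are illustrative assumptions and are not taken from this paper.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization of a probability map against a threshold map.

    Assumed form (not confirmed by the abstract):
        B = 1 / (1 + exp(-k * (P - T)))
    prob_map and thresh_map are equally sized 2-D lists with values in [0, 1].
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Hypothetical 2x2 maps, for illustration only.
prob = [[0.9, 0.2], [0.5, 0.7]]
thresh = [[0.3, 0.3], [0.5, 0.6]]
binary = differentiable_binarization(prob, thresh)

for row in binary:
    print([round(v, 3) for v in row])
```

Pixels whose probability is well above the local threshold map to values near 1, and those well below map to values near 0. Because the function is smooth, the threshold map can be trained together with the rest of the network instead of being fixed by hand.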
