Real-time Detection of Arbitrary Shape Scene Text Based on Segmentation (基于分割的任意形状场景文本实时检测)


Abstract: Current scene text detection faces two main challenges: the trade-off between real-time performance and accuracy, and the detection of arbitrarily shaped text. Together they determine whether scene text detection is practical in real-world settings. To address both problems, this paper takes a segmentation-based approach and proposes a lightweight backbone network with strong feature extraction ability that detects natural scene text of arbitrary shape accurately and in real time. Specifically, it uses a structurally simple dual-resolution residual backbone network and a deep aggregation pyramid pooling module with low computational cost; the features extracted by the two are fused and then segmented by a differentiable binarization module. Comparative experiments on the standard English dataset ICDAR2015 show that the proposed improvements are effective and that the method achieves comparable results in both real-time performance and accuracy.
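The abstract names a differentiable binarization module but does not give its formula. As a minimal sketch, assuming the module follows the commonly used formulation B = 1 / (1 + exp(-k(P - T))), where P is the predicted probability map, T is the learned threshold map, and k is an amplification factor (k = 50 is the value usually quoted for this technique, not one confirmed by this abstract), the step could look like the following. The function name and the sample values are hypothetical and serve only as illustration.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map B = 1 / (1 + exp(-k * (P - T))).

    prob_map and thresh_map are equally sized 2D lists with values in [0, 1].
    k = 50 is an assumed default, not a value taken from the abstract.
    """
    result = []
    for p_row, t_row in zip(prob_map, thresh_map):
        row = [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        result.append(row)
    return result


# Hypothetical 2x2 probability and threshold maps.
prob = [[0.9, 0.2], [0.5, 0.31]]
thresh = [[0.3, 0.3], [0.5, 0.3]]
approx_binary = differentiable_binarization(prob, thresh)
```

Pixels well above their threshold are pushed toward 1 and pixels well below toward 0, while the function stays differentiable, which is what allows the binarization to be trained together with the segmentation network rather than applied as a fixed post-processing step.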

Authors: XU Hong-kui; LI Zhen-ye; GUO Wen-tao; ZHAO Jing-zheng; GUO Xu-bin (School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China; Shandong Key Laboratory of Intelligent Buildings Technology, Jinan 250101, China)

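Affiliations: School of Information and Electrical Engineering, Shandong Jianzhu University; Shandong Key Laboratory of Intelligent Buildings Technology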

Source: Computer and Modernization (《计算机与现代化》), 2023, No. 11, pp. 95-100 (6 pages)

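Funding: Major Scientific and Technological Innovation Project of Shandong Province (2019JZZY010120); Key Research and Development Program of Shandong Province (2019GSF111054).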

Keywords: real-time text detection; dual resolution backbone; semantic segmentation; deep aggregation pyramid pooling module

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]


