Natural scene text detection based on double-branch cross-level feature fusion


Abstract: When handling arbitrarily shaped text, existing scene text detection methods locate text regions inaccurately and miss or falsely detect adjacent text because of complex backgrounds. To address this, a natural scene text detection method based on double-branch cross-level feature fusion is proposed. First, ResNet50 is used as the backbone network to extract initial features, and a cross-level feature distribution enhancement module (CFDEM) is designed to strengthen the interaction of text information across feature levels and improve the expressive power of the features. Second, to adaptively filter out non-text and redundant features and thereby reduce the false and missed detection rates, an adaptive fusion strategy (AFS) is proposed; it uses a double-branch structure to strengthen the relationship between features of different dimensions and to optimize the fusion process. Finally, in the prediction stage, differentiable binarization is used to generate the text detection results. Ablation experiments were carried out on the ICDAR2015, ICDAR2017, Total-Text, and CTW1500 datasets. The results show that the method locates text regions accurately and reduces the impact of missed and false text detections.
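The prediction stage named in the abstract relies on differentiable binarization. A minimal sketch of that step is given here, assuming the commonly used formulation B = 1 / (1 + exp(-k (P - T))), where P is a per-pixel text probability, T is a per-pixel threshold, and k is an amplification factor. The function name, the sample values, and the default k = 50 are illustrative assumptions and are not taken from this paper.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization of a probability map against a threshold map.

    Assumed formulation: B = 1 / (1 + exp(-k * (P - T))).
    prob_map and thresh_map are equally sized 2-D lists with values in [0, 1].
    k is a hypothetical default; the abstract does not state a value.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy 2x2 maps (made-up values, for illustration only).
prob = [[0.9, 0.2], [0.5, 0.31]]
thresh = [[0.3, 0.3], [0.5, 0.3]]
binary = differentiable_binarization(prob, thresh)
```

Pixels well above their threshold map to values close to 1, pixels well below map to values close to 0, and a pixel exactly at its threshold maps to 0.5. Unlike a hard step function, this mapping has a gradient everywhere, so the thresholding step can be trained together with the rest of the network.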

Authors: LIU Guanghui; ZHANG Yumin; MENG Yuebo; ZHAN Hua (School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China)

Affiliation: School of Information and Control Engineering, Xi’an University of Architecture and Technology

Source: CAAI Transactions on Intelligent Systems (智能系统学报), indexed in CSCD and the Peking University Core Journals list, 2023, No. 5, pp. 1079-1089 (11 pages)

Funding: National Natural Science Foundation of China (52278125); Key Research and Development Program of Shaanxi Province (2021SF-429).

Keywords: text detection; arbitrarily shaped; cross-level feature distribution enhancement; adaptive fusion; double branch; spatial dimension; channel dimension; differentiable binarization

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

