Natural scene text detection based on double-branch cross-level feature fusion
Abstract: When handling arbitrarily shaped text, existing scene text detection methods suffer from inaccurate localization of text regions and from missed and false detections of adjacent text, both caused by complex backgrounds. To address this, a natural scene text detection method based on double-branch cross-level feature fusion is proposed. First, ResNet50 is used as the backbone network to extract initial features, and a cross-level feature distribution enhancement module (CFDEM) is designed to strengthen the interaction of text information across cross-level features and improve the expressive power of the features. Second, to adaptively filter out non-text or redundant features and reduce the false and missed detection rates, an adaptive fusion strategy (AFS) is proposed, which uses a double-branch structure to strengthen the relationship between features of different dimensions and optimize the fusion process. Finally, in the prediction stage, differentiable binarization is used to generate the text detection results. Ablation experiments were conducted on the ICDAR2015, ICDAR2017, Total-Text, and CTW1500 datasets. The results show that the method locates text regions accurately and mitigates missed and false detections.
Authors: LIU Guanghui; ZHANG Yumin; MENG Yuebo; ZHAN Hua (School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China)
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), CSCD, Peking University Core Journals, 2023, No. 5, pp. 1079-1089 (11 pages)
Funding: National Natural Science Foundation of China (52278125); Key Research and Development Program of Shaanxi Province (2021SF-429).
Keywords: text detection; arbitrarily shaped text; cross-level feature distribution enhancement; adaptive fusion; double branch; spatial dimension; channel dimension; differentiable binarization
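The prediction step named in the abstract, differentiable binarization, can be illustrated with a minimal sketch. It assumes the commonly used formulation B = 1 / (1 + exp(-k (P - T))), where P is a probability map, T is a per-pixel threshold map, and k is an amplification factor. The function name, the value k = 50, and the toy maps below are assumptions for illustration; this record does not give the authors' implementation details.

```python
import math

def differentiable_binarization(prob, thresh, k=50.0):
    """Approximate binary map B = 1 / (1 + exp(-k * (P - T))).

    prob and thresh are 2-D lists with values in [0, 1], so the
    exponent stays within [-k, k] and math.exp does not overflow
    for the assumed k = 50.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(prow, trow)]
        for prow, trow in zip(prob, thresh)
    ]

# Hypothetical 2x2 probability and threshold maps (not from the paper).
prob = [[0.9, 0.2], [0.5, 0.7]]
thresh = [[0.3, 0.3], [0.5, 0.6]]

approx = differentiable_binarization(prob, thresh)

# Hard (non-differentiable) thresholding, for comparison only.
hard = [
    [1 if p >= t else 0 for p, t in zip(prow, trow)]
    for prow, trow in zip(prob, thresh)
]

for soft_row, hard_row in zip(approx, hard):
    print([round(v, 4) for v in soft_row], hard_row)
```

Pixels well above their threshold map to values near 1 and pixels well below map to values near 0, while the function stays smooth around P = T. That smoothness is what lets the binarization step be trained jointly with the rest of a network instead of being applied only as post-processing.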