
Scene Text Detection Based on the Connection of the Characters (基于字符连接的场景文本检测)
Abstract: In recent years, research on scene text detection has broadened considerably. Thanks to advances in deep convolutional networks and image segmentation, scene text detectors can now generate varied text boxes for curved text of arbitrary shape. However, text in scene images is sometimes very small or has an extreme aspect ratio. With deep convolution and a limited receptive field, a network easily loses the feature information of small text and fails to capture the complete features of long text. To address these two difficulties, this paper designs a scene text detector based on character connection. An improved AFF (attentional feature fusion) module fuses local features with global features, making the network more sensitive to small text targets and avoiding missed detections of small text. The network outputs character region scores and character gap scores, and text lines are formed by linking characters according to their connection properties, so that the network can detect text of arbitrary length despite its limited receptive field. Because general text detection datasets lack character-level annotations, a weakly supervised learning strategy is used to generate character-level pseudo-labels, and a character-level synthetic dataset is built to compensate for the shortcomings of weak supervision, so that the network can better learn the features of scene text. Experimental results show that the method achieves excellent performance on the general datasets ICDAR2015 and MSRA-TD500.
Authors: WANG Liangjun (王良君); JI Yuhang (季宇航); GU Weijie (顾维杰) (School of Computer and Communication Engineering, Jiangsu University, Zhenjiang 212013)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2024, No. 7, pp. 2108-2114 (7 pages)
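Funding: Supported by the National Natural Science Foundation of China (No. 61601202), the Natural Science Foundation of Jiangsu Province (No. BK20140571), and the Jiangsu University Senior Professional Talent Research Start-up Fund (No. 14JDG038).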
Keywords: scene text; attentional feature fusion; weakly supervised learning; connection of the characters
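
The abstract says the network outputs character region scores and character gap scores, and that characters are linked into text lines through their connection properties. The paper's actual post-processing is not given here, so the following is only a minimal sketch of one plausible way to do this: threshold both score maps, merge them into one mask, and treat each connected component as a text line. The function name `group_text_lines`, the thresholds `region_thr` and `gap_thr`, and the toy score maps are all assumptions made for illustration.

```python
from collections import deque


def group_text_lines(region, gap, region_thr=0.5, gap_thr=0.5):
    """Return (top, left, bottom, right) boxes of linked character groups.

    `region` and `gap` are equally sized 2D lists of scores in [0, 1].
    Thresholds are illustrative assumptions, not values from the paper.
    """
    h, w = len(region), len(region[0])
    # A pixel belongs to a text line if it lies on a character
    # or in the gap that connects two neighbouring characters.
    mask = [[region[y][x] >= region_thr or gap[y][x] >= gap_thr
             for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            top, left, bottom, right = y0, x0, y0, x0
            has_char = False
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                has_char = has_char or region[y][x] >= region_thr
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Discard components made of gap pixels only.
            if has_char:
                boxes.append((top, left, bottom, right))
    return boxes


# Toy maps: two characters joined by a gap, plus one isolated character.
region = [
    [0.9, 0.0, 0.8, 0.0, 0.0, 0.9],
    [0.9, 0.0, 0.8, 0.0, 0.0, 0.9],
]
gap = [
    [0.0, 0.7, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0, 0.0, 0.0],
]
boxes = group_text_lines(region, gap)
```

In this toy input the first two characters are merged into one box because a high gap score bridges them, while the third character stays separate. Since linking depends only on local gap scores, a line can grow to any length regardless of the receptive field, which is the property the abstract emphasizes.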
