Scene Text Detection Based on the Connection of the Characters


Abstract: Research on scene text detection has broadened in recent years. Thanks to advances in deep convolutional networks and image segmentation, scene text detectors can now generate varied text boxes for curved text of arbitrary shape. Text in scene images, however, is sometimes very small or has an extreme aspect ratio. With deep convolution and a limited receptive field, a network easily loses the feature information of small text and fails to capture the complete features of long text. To address these two difficulties, the paper designs a scene text detector based on character connection. An improved AFF (attentional feature fusion) module fuses local and global features, making the network more sensitive to small text targets and avoiding missed detections of small text. The network outputs character-region and character-gap scores, and text lines are assembled according to the connection property between characters, so that the network can detect text of arbitrary length despite its limited receptive field. Because general text detection datasets lack character-level annotations, a weakly supervised learning strategy is used to generate character-level pseudo-labels, and a character-level synthetic dataset is built to compensate for the shortcomings of weakly supervised learning, so that the network learns the features of scene text better. Experimental results show that the method performs excellently on the general datasets ICDAR2015 and MSRA-TD500.
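The abstract does not give the linking procedure, so the following is only a minimal sketch of the general idea of connecting characters into text lines. It assumes the detector's output has already been reduced to axis-aligned character boxes plus one score per pair of neighbouring characters. The function name `group_characters`, the `gap_scores` mapping, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def group_characters(
    boxes: List[Box],
    gap_scores: Dict[Tuple[int, int], float],
    threshold: float = 0.5,
) -> List[Box]:
    """Merge character boxes into text-line boxes.

    gap_scores maps a pair of character indices to a hypothetical score
    in [0, 1]; pairs scoring at or above `threshold` are treated as
    belonging to the same text line (an assumption of this sketch).
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Union every pair of characters whose gap score is high enough.
    for (i, j), score in gap_scores.items():
        if score >= threshold:
            parent[find(i)] = find(j)

    # Collect the boxes of each connected group.
    groups: Dict[int, List[Box]] = {}
    for idx, box in enumerate(boxes):
        groups.setdefault(find(idx), []).append(box)

    # One enclosing box per group, sorted for a deterministic order.
    lines = [
        (
            min(b[0] for b in g),
            min(b[1] for b in g),
            max(b[2] for b in g),
            max(b[3] for b in g),
        )
        for g in groups.values()
    ]
    return sorted(lines)


boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10), (100, 50, 110, 60)]
gap_scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1}
text_lines = group_characters(boxes, gap_scores)
```

Because each link is decided locally between two neighbouring characters, a chain of such links can span a text line of any length, which is the property the abstract attributes to character connection. A detector for curved text would need polygons rather than the enclosing rectangles used here.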

Authors: WANG Liangjun; JI Yuhang; GU Weijie (School of Computer and Communication Engineering, Jiangsu University, Zhenjiang 212013)

Affiliation: School of Computer Science and Communication Engineering, Jiangsu University

Source: Computer & Digital Engineering, 2024, No. 7, pp. 2108-2114 (7 pages)

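Funding: Supported by the National Natural Science Foundation of China (No. 61601202), the Natural Science Foundation of Jiangsu Province (No. BK20140571), and the Jiangsu University Research Start-up Fund for Senior Professional Talent (No. 14JDG038).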

Keywords: scene text; attentional feature fusion; weakly supervised learning; connection of the characters

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

