Text recognition of natural scenes in any direction (cited by: 2)
Abstract: Natural scene text recognition is a highly challenging task in computer vision. This paper proposes a recognition algorithm for natural scene text in arbitrary orientations. A high-resolution segmentation network serves as the basic framework and extracts the spatial information of the text, while a convolutional long short-term memory (ConvLSTM) network extracts its spatiotemporal sequence information. A character attention mechanism focuses the model on the characters, and a differentiable binarization function further strengthens the network's attention to the foreground and weakens its attention to background regions. The network classifies each pixel into one of 37 classes, and a text transcription module converts the classification results into text in left-to-right order. The algorithm was tested on several standard datasets, including ICDAR2013, ICDAR2003, SVT-Perspective (SVTP), CUTE80 (CUTE) and IIIT-5K. It performs well on both regular and irregular text; on the curved-text dataset CUTE the recognition accuracy reaches 83.3%, which supports the effectiveness of the proposed algorithm.
Authors: ZHU Li; CHEN Hong; JING Xiaorong (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China; College of Physics and Electronic Engineering, Sichuan Normal University, Chengdu 610101, P.R. China; Chongqing Key Lab of Mobile Communications Technology, Chongqing 400065, P.R. China)
Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), indexed in CSCD and the Peking University Core Journals list, 2022, No. 1, pp. 125-133 (9 pages)
Funding: National Natural Science Foundation of China (61701062); Chongqing Basic and Frontier Research Program (cstc2019jcyj-msxmX0079).
Keywords: natural scene text recognition; convolutional long short-term memory (ConvLSTM); character attention mechanism
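
The abstract mentions two steps that a small example may help make concrete: the differentiable binarization function and the transcription of per-pixel classes into left-to-right text. The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the 37 classes are one background class plus 36 alphanumeric characters, that the binarization is a steep sigmoid with steepness `k`, and that transcription groups connected foreground pixels, takes a majority vote per group, and sorts groups by mean horizontal position. The names `CHARSET`, `differentiable_binarization` and `transcribe` are hypothetical.

```python
import math
from collections import Counter

# Assumed class layout: index 0 is background, 1-36 map onto CHARSET.
CHARSET = "0123456789abcdefghijklmnopqrstuvwxyz"


def differentiable_binarization(prob, thresh, k=50.0):
    """Soft step 1 / (1 + exp(-k * (prob - thresh))); k is an assumed steepness."""
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))


def transcribe(class_map):
    """Convert a 2-D grid of per-pixel class indices into a left-to-right string."""
    height = len(class_map)
    width = len(class_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []  # one (mean x position, character) pair per connected region

    for y in range(height):
        for x in range(width):
            if class_map[y][x] == 0 or seen[y][x]:
                continue
            # Flood fill over 4-connected foreground pixels of any class.
            stack = [(y, x)]
            seen[y][x] = True
            xs = []
            votes = Counter()
            while stack:
                cy, cx = stack.pop()
                xs.append(cx)
                votes[class_map[cy][cx]] += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and class_map[ny][nx] != 0):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # The majority class inside the region decides the character.
            label = votes.most_common(1)[0][0]
            regions.append((sum(xs) / len(xs), CHARSET[label - 1]))

    # Order the characters from left to right by mean column.
    regions.sort(key=lambda region: region[0])
    return "".join(char for _, char in regions)


if __name__ == "__main__":
    demo = [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 12, 12, 0, 11, 11, 0],
        [0, 12, 12, 0, 11, 0, 0],
    ]
    print(transcribe(demo))
```

Sorting by mean column only suits roughly horizontal reading order. Strongly curved or vertical text would need a different ordering rule, which the abstract does not specify.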