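Abstract
To improve text detection and recognition performance on low-quality scene text images, a Transformer-based character-level scene text detection algorithm is proposed. Based on the order of characters within a scene text line, a Transformer-based encoder-decoder structure is designed that outputs the coordinates and the predicted order of each character detection box. Following the idea of the Hungarian algorithm, a loss function based on character detection box coordinates and an ordering loss is designed to improve the accuracy of matching detection results. Experiments on scene character detection and scene character recognition over three character-level annotated scene text datasets show that the proposed algorithm achieves good performance and outperforms the compared algorithms on multiple evaluation metrics.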
To address character-level scene text detection and recognition under imperfect imaging conditions, a Transformer-based scene character detection algorithm is proposed. First, a Transformer-based encoder-decoder structure is designed that takes the order of characters in each text instance into account, so that it outputs both the position and the sequence order of each character detection box. Then, the Hungarian algorithm is used to compute the loss, which combines bounding box coordinate and ranking losses. Finally, experiments on three character-level annotated datasets show that, under different evaluation metrics, the proposed method achieves good performance in both scene character localization and recognition.
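The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost function, the `order_weight` parameter, and the names `match_cost` and `best_matching` are assumptions made for illustration. For readability it finds the optimal one-to-one assignment by brute force over permutations, which yields the same assignment the Hungarian algorithm would return on small inputs.

```python
from itertools import permutations


def match_cost(pred, target, order_weight=1.0):
    """Cost of pairing one predicted character with one ground-truth character.

    Each item is a (box, order) pair: box is (x1, y1, x2, y2) and order is the
    character's position in the text line. The L1 box distance plus a weighted
    order difference is an illustrative assumption, not the paper's loss.
    """
    (p_box, p_order), (t_box, t_order) = pred, target
    box_cost = sum(abs(p - t) for p, t in zip(p_box, t_box))
    order_cost = abs(p_order - t_order)
    return box_cost + order_weight * order_cost


def best_matching(preds, targets, order_weight=1.0):
    """Return (assignment, total_cost) for the cheapest one-to-one matching.

    assignment[i] is the index of the prediction matched to targets[i].
    Brute force stands in for the Hungarian algorithm here; it is only
    practical for a handful of characters.
    """
    if len(preds) < len(targets):
        raise ValueError("need at least as many predictions as targets")
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(
            match_cost(preds[p], targets[t], order_weight)
            for t, p in enumerate(perm)
        )
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost


# Two ground-truth characters and three predictions (one is a spurious box).
targets = [((0, 0, 10, 10), 0), ((12, 0, 22, 10), 1)]
preds = [((12, 1, 22, 10), 1), ((0, 0, 10, 11), 0), ((50, 50, 60, 60), 2)]
assignment, cost = best_matching(preds, targets)
print(assignment, cost)
```

In this toy input, the first ground-truth character is paired with the second prediction and the second with the first, while the distant third box is left unmatched. A real system would replace the brute-force search with a polynomial-time assignment solver.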
Authors
张重生
陈杰
纵瑞星
杨帅磊
凡高娟
ZHANG Chongsheng; CHEN Jie; ZONG Ruixing; YANG Shuailei; FAN Gaojuan (School of Computer and Information Engineering, Henan University, Kaifeng 475001, China; Henan Provincial Key Laboratory of Big Data Analysis and Processing, Kaifeng 475001, China)
Source
《北京邮电大学学报》
EI
CAS
CSCD
PKU Core Journal
2022, No. 2, pp. 124-130 (7 pages)
Journal of Beijing University of Posts and Telecommunications
Funding
Major International Science and Technology Cooperation Cultivation Project of Henan University.