Scene Text Image Super-Resolution Reconstruction Based on Dual-Branched Sequence Residual Attention

Abstract: Existing scene text image super-resolution reconstruction methods lose detail information and blur edges in the reconstructed text images. To address these problems, this paper proposes the Dual-branch Sequence Residual Attention for Super-resolution Reconstruction Network (DSRASRN). First, DSRASRN adopts a new Dual-branch Sequence Residual Attention Block (DSRAB). Its two branches extract contextual information along the horizontal and vertical directions, respectively, and an Efficient Channel Attention (ECA) mechanism assigns higher weights to the more important information, thereby strengthening the feature representation. Second, DSRASRN adds a Text Edge Awareness Block (TEAB) to improve the handling of edge details and textures in text images. TEAB applies direction-specific convolution kernels to capture information along particular spatial directions and combines them with dilated convolutions of different dilation rates to enlarge the receptive field and improve the reconstruction of high-frequency information. Experiments on the real-scene text image dataset TextZoom show that DSRASRN reconstructs more image detail and also yields clearly higher text recognition accuracy. Compared with TSRN, TBSRN, TG, and TPGSR, DSRASRN improves the Peak Signal-to-Noise Ratio (PSNR) by 0.27, 0.78, 0.59, and 0.51 dB, respectively, and it raises the average text recognition accuracy of the recognizers ASTER, MORAN, and CRNN to 65.0%, 62.1%, and 52.0%, respectively. In addition, tests on the real-scene text recognition datasets ICDAR2015 and SVT indicate that DSRASRN generalizes well.
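For readers who want to relate the reported PSNR gains to pixel error, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE). It is not the authors' evaluation code. The function names `psnr` and `mse_ratio_for_gain`, the nested-list grayscale image format, and the 8-bit peak value of 255 are all assumptions made for illustration; the paper's protocol (color handling, value range) may differ.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB between two equally sized grayscale images.

    Images are nested lists of pixel rows (an illustrative format, assumed
    here). max_value is the peak pixel value; 255 assumes 8-bit images.
    """
    if len(reference) != len(reconstructed) or any(
        len(a) != len(b) for a, b in zip(reference, reconstructed)
    ):
        raise ValueError("images must have the same shape")
    pixel_count = sum(len(row) for row in reference)
    if pixel_count == 0:
        raise ValueError("images must not be empty")
    squared_error = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, reconstructed)
        for a, b in zip(row_a, row_b)
    )
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    high_res = [[0, 255], [255, 0]]
    super_res = [[0, 250], [255, 5]]
    print(f"PSNR: {psnr(high_res, super_res):.2f} dB")
    print(f"MSE ratio for +0.27 dB: {mse_ratio_for_gain(0.27):.4f}")
```

Under this definition, a gain of 0.27 dB corresponds to a mean squared error roughly 6% lower than the baseline, which helps put sub-decibel improvements in context.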

Authors: LI Dahai (李大海); LÜ Chungui (吕春桂); WANG Zhendong (王振东) (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China)

Affiliation: School of Information Engineering, Jiangxi University of Science and Technology

Source: Computer Engineering (《计算机工程》), 2024, No. 9, pp. 286-295 (10 pages). Indexed in: CAS, CSCD, Peking University Core Journals (北大核心).

Funding: National Natural Science Foundation of China (61563019, 61562037).

Keywords: super-resolution reconstruction; scene text image; dual-branched sequence residuals; feature enhancement; edge awareness

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

