End-to-end scene character detection and recognition algorithm based on differentiable architecture search
Cited by: 1
Abstract: In scene character detection and recognition, most existing methods treat detection and recognition as relatively independent stages, which slows processing. Their training and inference pipelines are also relatively complex, and designing a suitable architecture by hand is difficult. To address these problems, a Multi-Branch Automatic Selection Network (MBASNet) is proposed based on the differentiable architecture search method. The network consists of several Multi-Branch Automatic Selection Blocks (MBASBs). Each MBASB automatically searches for the sub-branch structure with better detection and recognition performance without significantly increasing the computational cost, and multiple MBASBs are combined to form the whole detection and recognition network. MBASNet trains the detection and recognition subnetworks at the same time, which reduces the difficulty of training and inference in character detection and recognition tasks and improves detection and recognition speed. MBASNet achieves 89.4% precision and 91.4% recall on the ICDAR2013 dataset and 80.5% precision and 86.8% recall on the ICDAR2015 dataset, at a computational speed of 68 frames per second (FPS).
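The abstract does not give the internals of an MBASB, so the following is only a minimal sketch of the general idea behind differentiable architecture search that the method builds on: each candidate branch gets a learnable architecture weight, the block output during search is the softmax-weighted sum of all branch outputs, and after search the branch with the largest weight is kept. The branch functions (`identity`, `scale_double`, `zero`) and the names `mixed_forward` and `select_branch` are hypothetical placeholders, not the authors' operations or API.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over a list of architecture weights."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate branches; a real block would use convolutions etc.
def identity(x):
    return list(x)


def scale_double(x):
    return [2.0 * v for v in x]


def zero(x):
    return [0.0 for _ in x]


CANDIDATES = [identity, scale_double, zero]


def mixed_forward(x, alphas, branches=CANDIDATES):
    """Search-time output: softmax-weighted sum of every branch's output."""
    weights = softmax(alphas)
    outputs = [branch(x) for branch in branches]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def select_branch(alphas, branches=CANDIDATES):
    """After search: keep only the branch with the largest weight."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return branches[best]
```

In an actual search the architecture weights would be updated by gradient descent together with the network weights; this sketch covers only the forward mixing and the final discrete selection.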
Authors: LIU Jiayi (刘嘉艺); CAO Dongping (曹冬平); ZHONG Yong (钟勇) (Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2023, Issue S01, pp. 81-87 (7 pages)
Funding: Sichuan Province Science and Technology Achievement Transformation Program (2020ZHZY0002)
Keywords: deep learning; Convolutional Neural Network (CNN); text detection; character recognition; differentiable architecture search
