
Improvement of CRNN Model for Text Recognition
Abstract Text recognition in complex scenes loses accuracy when the text is affected by shadows, missing or damaged characters, blur, or defocus. To address this, a CRNN model based on feature fusion and a bidirectional simplified gate structure is proposed. First, a feature fusion mechanism is introduced to improve the convolutional neural network (CNN) part: building on the feature pyramid structure, an additional bottom-up path is added to fuse low-level features with high-level features, which retains more low-level detail and improves scene text recognition accuracy. Second, the forget gate and the input gate are merged to obtain a simplified gate structure with a simpler design, less computation, and fewer parameters; it replaces the long short-term memory (LSTM) network in the recurrent neural network (RNN) part. Finally, ablation experiments are conducted to verify the effectiveness of the improved model. Test results on three datasets show that, compared with the original model, the improved model raises accuracy by more than 1.5% with ResNet50 as the backbone network and by more than 1.4% with MobileNetV3 as the backbone network.
Authors LÜ Yanhui (吕艳辉); LIU Mingxin (刘明鑫) (Shenyang Ligong University, Shenyang 110159, China)
Source Journal of Shenyang Ligong University (《沈阳理工大学学报》), CAS, 2024, No. 4, pp. 27-31 (5 pages)
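Funding Basic Scientific Research Project of Higher Education Institutions, Liaoning Provincial Department of Education (JYTMS20230192).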
Keywords feature fusion; long short-term memory network; simplified gate structure
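
The abstract describes the simplified gate structure only in words: the forget gate and the input gate are merged, and the result is run bidirectionally in place of LSTM. The sketch below illustrates one common way to do such a merge, tying the input gate to the forget gate as i = 1 - f. This is an assumption made for illustration, and the paper's actual formulation may differ. All function names, the parameter layout, and the initialization range are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, b, v):
    """Return W @ v + b for a list-of-lists matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def init_params(input_size, hidden_size, seed=0):
    """Hypothetical layout: three weight blocks (f, g, o) instead of LSTM's four."""
    rng = random.Random(seed)

    def block():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size

    return {"f": block(), "g": block(), "o": block()}

def coupled_step(params, x, h_prev, c_prev):
    """One time step with the input gate tied to the forget gate (assumed i = 1 - f)."""
    z = list(x) + list(h_prev)
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # merged forget/input gate
    g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate cell state
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    c = [fi * ci + (1.0 - fi) * gi for fi, ci, gi in zip(f, c_prev, g)]
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
    return h, c

def run(params, seq, hidden_size):
    """Run the cell over a sequence, returning the hidden state at every step."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    outputs = []
    for x in seq:
        h, c = coupled_step(params, x, h, c)
        outputs.append(h)
    return outputs

def bidirectional(fwd_params, bwd_params, seq, hidden_size):
    """Concatenate a forward pass with a backward pass, aligned per time step."""
    forward = run(fwd_params, seq, hidden_size)
    backward = run(bwd_params, seq[::-1], hidden_size)[::-1]
    return [a + b for a, b in zip(forward, backward)]

def gate_param_count(input_size, hidden_size, n_blocks):
    """Weights plus one bias vector per block; 4 blocks for LSTM, 3 when i = 1 - f."""
    return n_blocks * (hidden_size * (input_size + hidden_size) + hidden_size)
```

Under this counting, with input_size=4 and hidden_size=8, gate_param_count returns 416 for four blocks and 312 for three, a 25% reduction per direction. That is in line with the abstract's statement that the simplified structure needs fewer parameters and less computation, though the figures here come from the assumed formulation and not from the paper.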
