Abstract
Detecting and recognizing Chinese and English text in natural scenes is hampered by noise in the images themselves. To improve detection and recognition efficiency, a natural-scene text recognition method based on YOLOv3 and CRNN is proposed. First, the text angle is predicted and the image is rotated according to the predicted angle. Next, a YOLOv3 text-region detection algorithm produces several groups of fixed-width text boxes. A clustering algorithm then connects these fixed-width boxes into a single text box containing complete semantics. Finally, the CRNN algorithm recognizes the text inside each detection box. Experimental results show that the YOLOv3 and CRNN model recognizes 100 images in 0.4258 s, whereas the CTPN and DenseNet model takes 0.8250 s in the same experimental environment, indicating that the YOLOv3 and CRNN model has higher recognition efficiency than the CTPN and DenseNet model.
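The box-connection step can be illustrated with a small sketch. The abstract does not specify which clustering algorithm is used, so the code below substitutes a simple greedy left-to-right grouping: a fixed-width box joins an existing line when the horizontal gap to that line's last box is small and the two boxes overlap vertically. The `Box` type, the function names, and the thresholds `max_gap` and `min_overlap` are all assumptions made for illustration, not values from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its top-left (x1, y1) and bottom-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Vertical intersection divided by the height of the shorter box."""
    inter = min(a.y2, b.y2) - max(a.y1, b.y1)
    shorter = min(a.y2 - a.y1, b.y2 - b.y1)
    return max(inter, 0.0) / shorter if shorter > 0 else 0.0


def merge_boxes(boxes: List[Box],
                max_gap: float = 20.0,      # hypothetical threshold, in pixels
                min_overlap: float = 0.7):  # hypothetical threshold, a ratio
    """Greedily group fixed-width boxes into lines and return one box per line."""
    lines: List[List[Box]] = []
    # Scan left to right so each line's last element is its rightmost box.
    for box in sorted(boxes, key=lambda b: b.x1):
        for line in lines:
            last = line[-1]
            close_enough = box.x1 - last.x2 <= max_gap
            same_row = vertical_overlap(last, box) >= min_overlap
            if close_enough and same_row:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new line.
            lines.append([box])
    # The merged box of a line is the bounding rectangle of its members.
    return [
        Box(min(b.x1 for b in line), min(b.y1 for b in line),
            max(b.x2 for b in line), max(b.y2 for b in line))
        for line in lines
    ]
```

Each merged box would then be cropped from the rotated image and passed to the recognizer. A real implementation would likely need to handle skewed lines and boxes of unequal height, which this sketch ignores.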
Authors
WU Qi-ming (吴启明)
SONG Yu-tong (宋雨桐)
Affiliations: College of Computer and Information Engineering, Hechi University, Yizhou 546300, China; School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Source
Computer Engineering and Design (《计算机工程与设计》)
PKU Core Journal (北大核心)
2022, No. 8, pp. 2352-2360 (9 pages)
Funding
National Natural Science Foundation of China (61672254)
Guangxi Natural Science Foundation (2012GXNSFBA220023, 2020GXNSFAA159172)