Abstract
To automatically recognize the text on power equipment nameplates and improve the efficiency of equipment management, a text recognition method for scene text is proposed. The method relies on a long short-term memory (LSTM) based visual attention model trained on convolutional features. A set of feature vectors is extracted from a convolutional layer, each corresponding to a different region of the image, so that the spatial information of the image is encoded in the features. By assigning attention weights, the model can choose to focus on different parts of the image, and it recognizes text by combining the convolutional features with the attention weights. In addition, introducing a language model and modifying the beam search strategy significantly improves recognition. Results on a real-world dataset verify the effectiveness of the method.
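The attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the function names `softmax` and `attend`, and the toy feature values are all assumptions chosen for brevity. It only shows how attention weights over region features yield a weighted context vector for one decoding step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, hidden):
    """Return (weights, context) for one decoding step.

    features: list of region feature vectors taken from a convolutional layer
    hidden:   current decoder state (e.g. an LSTM hidden state), same length
              as each feature vector
    Dot-product scoring is an assumption; the paper may score differently.
    """
    scores = [sum(f_d * h_d for f_d, h_d in zip(f, hidden)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Context vector: attention-weighted sum of the region features.
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


# Three hypothetical image regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(features, hidden)
```

In a full recognizer, the context vector would be fed to the LSTM together with the previously emitted character, and the step would repeat until an end-of-sequence symbol is produced.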
Authors
刘影
张忠宝
张威
鲁观娜
彭鑫霞
LIU Ying; ZHANG Zhongbao; ZHANG Wei; LU Guanna; PENG Xinxia (State Grid Jibei Electric Power Co. Ltd., Beijing 102208, China)
Source
《电子器件》
CAS
Peking University Core Journals
2022, No. 3, pp. 623-627 (5 pages)
Chinese Journal of Electron Devices
Funding
Science and Technology Project of State Grid Jibei Electric Power Co. Ltd. (52018K19002G).
Keywords
power device
scene text recognition
computer vision
convolutional neural network
long short-term memory