Abstract
To address the slow running speed and large model size of deep-learning text detection algorithms, a text detection method based on an improved You Only Look Once v3 (YOLOv3), named mobile-text-YOLOv3, is proposed. The Darknet-53 network is made lightweight using depthwise separable convolution, and in the high-level features bilinear interpolation and an offset layer give the convolution kernels a variable receptive field, which considerably improves model performance. D-IOU is improved by introducing a width penalty, which alleviates the sparsity of anchors in the vertical direction and the imbalance when regressing the target shape, thereby improving detection accuracy. Experimental results show that the improved algorithm is 7 percentage points more accurate than YOLOv3 and reaches a detection speed of up to 22 frame/s. Compared with similar algorithms, it offers faster detection and a smaller model size.
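For readers unfamiliar with the D-IOU term mentioned in the abstract, the following is a minimal sketch of the standard Distance-IoU measure for two axis-aligned boxes. It is not the paper's modified loss: the width penalty is not specified in this record and is therefore omitted. The function name `diou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def diou(box_a, box_b):
    """Return (iou, diou) for two axis-aligned boxes given as (x1, y1, x2, y2).

    Standard Distance-IoU only; the width penalty described in the
    abstract is NOT included because its form is not given here.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both inputs.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    c2 = enc_w ** 2 + enc_h ** 2

    # DIoU = IoU minus the normalized center-distance penalty.
    d = iou - rho2 / c2 if c2 > 0 else iou
    return iou, d
```

A regression loss is commonly taken as `1 - diou`. Unlike plain IoU, this still gives a useful signal for boxes that do not overlap, because the center-distance term stays non-zero.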
Authors
王霏
黄俊
文洪伟
WANG Fei; HUANG Jun; WEN Hongwei (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source
《电讯技术》
Peking University Core Journal
2022, No. 1, pp. 130-137 (8 pages)
Telecommunication Engineering
Funding
Supported by the National Natural Science Foundation of China (61671095).
Keywords
natural scenes
text detection
depthwise separable convolution
deformable convolution