Abstract
Scene text detection has advanced rapidly in recent years. This paper proposes Mask Text Detector, a novel algorithm for detecting text of arbitrary shape. Building on Mask R-CNN, the algorithm replaces the original RPN layer with an anchor-free method for generating proposal boxes, which reduces the number of hyperparameters, the number of model parameters, and the amount of computation. The paper also proposes LQCS (Localization Quality and Classification Score) joint regression, which ties localization quality to the classification score and so eliminates the inconsistency problem at the prediction stage. To help the network distinguish complex samples, the paper draws on traditional edge detection algorithms and proposes a Socle-Mask branch to generate segmentation masks. This module extracts texture features separately in the horizontal and vertical directions and adds a channel self-attention mechanism so that the network can select channel features on its own. Extensive experiments on three challenging datasets (Total-Text, CTW1500, and ICDAR2015) verify that the algorithm achieves good text detection performance.
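The abstract does not spell out how LQCS is formulated, so the following is only a minimal sketch of one common way to couple localization quality with the classification score: the training target at the ground-truth class is set to the IoU between the predicted and ground-truth boxes instead of a hard 1. The function names (`box_iou`, `joint_target`, `joint_loss`) and the choice of binary cross-entropy are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def joint_target(pred_box, gt_box, gt_class, num_classes):
    """Hypothetical soft label: IoU at the ground-truth class, 0 elsewhere."""
    target = np.zeros(num_classes, dtype=np.float64)
    target[gt_class] = box_iou(pred_box, gt_box)
    return target

def joint_loss(pred_scores, target):
    """Binary cross-entropy against the soft target, averaged over classes."""
    p = np.clip(pred_scores, 1e-7, 1 - 1e-7)
    return float(np.mean(-(target * np.log(p) + (1 - target) * np.log(1 - p))))

# Usage: a loosely localized box yields a soft target below 1 for the text class.
target = joint_target((0, 0, 2, 2), (1, 0, 3, 2), gt_class=1, num_classes=2)
loss = joint_loss(np.array([0.05, 0.40]), target)
```

Under this assumption a single score carries both "is it text" and "how well is it localized", so ranking detections by that score at inference matches what was optimized during training.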
Authors
XIANG Wei (向伟)
CHENG Bo (程博)
YANG Hang (杨航)
ZHU Lai-li (祝来李)
WU Yu-zhi (武钰智)
WANG Ya-li (王雅丽)
Key Laboratory of Electronic and Information Engineering of State Ethnic Affairs Commission, Southwest Minzu University, Chengdu 610041, China
Source
Journal of Southwest Minzu University (Natural Science Edition) (《西南民族大学学报(自然科学版)》)
CAS
2022, No. 6, pp. 660-666 (7 pages)
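Funding
National Natural Science Foundation of China (62073270)
Southwest Minzu University Special Project of the Fundamental Research Funds for the Central Universities (2020NYBPY02)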
Keywords
target detection
text detection
image processing
segmentation network