Abstract
In traditional anchor-box-based methods for natural scene text detection, anchor boxes are prone to interference from other text instances, which causes misjudgments or reduces accuracy. Moreover, text instances carry strong topological features that are usually ignored, leading to poor performance on curved and circular text. To address these problems, a novel neural network structure is proposed. It introduces graph convolutional networks to fully account for the relationships between neighboring anchor boxes, and incorporates the topological features of the anchor boxes to assist the learning of the graph neural network, improving the effectiveness of the overall network. Ablation experiments were conducted on two public natural scene text detection datasets. On CTW1500, the proposed method improved recall, precision, and F-score by approximately 3.0%, 1.9%, and 2.5%, respectively; on Total-Text, the corresponding improvements were approximately 2.2%, 1.8%, and 2.0%. The method was also compared with other text detection algorithms proposed in recent years. The experimental results show that it performs well on text detection in complex natural scenes and that the proposed module helps improve text detection performance.
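The core idea in the abstract, letting each anchor box aggregate information from its neighbours while also carrying geometric cues, can be illustrated with a small sketch. This is not the authors' network. It assumes a k-nearest-neighbour graph over box centers, a single standard graph-convolution step with symmetric normalisation, and hypothetical "topological" features (width, height, angle) concatenated to random stand-in visual features. The names `knn_adjacency`, `gcn_layer`, and all values are illustrative only.

```python
import numpy as np


def knn_adjacency(centers, k):
    """Symmetric k-nearest-neighbour adjacency with self-loops (assumed graph construction)."""
    n = len(centers)
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a box is never its own neighbour here
    adj = np.eye(n)                  # self-loops keep each box's own features
    for i in range(n):
        neighbours = np.argsort(dists[i])[:k]
        adj[i, neighbours] = 1.0
        adj[neighbours, i] = 1.0     # make the graph undirected
    return adj


def gcn_layer(features, adj, weight):
    """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


rng = np.random.default_rng(0)

# Five hypothetical anchor boxes: four along a curved text line, one far away.
centers = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.8], [3.0, 1.8], [10.0, 10.0]])
visual = rng.normal(size=(5, 8))           # stand-in for CNN features per box
geometry = np.array([                       # hypothetical width, height, angle
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.2],
    [1.0, 1.0, 0.5],
    [1.0, 1.0, 0.8],
    [1.0, 1.0, 0.0],
])
features = np.concatenate([visual, geometry], axis=1)  # shape (5, 11)
weight = rng.normal(size=(11, 4))           # untrained projection, for shape only

adj = knn_adjacency(centers, k=2)
out = gcn_layer(features, adj, weight)
print(out.shape)  # (5, 4)
```

After one step, each box's output mixes its own features with those of its graph neighbours, which is the mechanism by which neighbouring anchor boxes can inform one another. In a trained model the weights would be learned, and the graph construction and geometric features would follow the paper rather than the assumptions made here.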
Authors
郑侠聪
程良伦
黄国恒
王敬超
Zheng Xia-cong; Cheng Liang-lun; Huang Guo-heng; Wang Jing-chao (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China)
Source
Journal of Guangdong University of Technology (《广东工业大学学报》)
CAS
2024, Issue 3, pp. 102-109 (8 pages)
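Funding
National Natural Science Foundation of China (U20A6003)
National Natural Science Foundation of China, Guangdong Joint Fund (U1801263, U1701262, U2001201)
Guangdong Provincial Key Laboratory of Cyber-Physical Systems (2020B1212060069)
Foshan Key Field Science and Technology Research Project (2020001006832)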
Keywords
text detection
natural scene
graph convolutional networks (GCN)
topological feature