Abstract
Convolutional neural networks have greatly improved the accuracy of natural scene text detection. However, the scale variability caused by camera perspective and by the text itself, together with the diversity of text distribution, still poses challenges. To address text scale variability, this paper proposes a new multi-level feature fusion module: while a feature pyramid fuses features from different levels, an additional dilated convolution pooling branch provides different receptive fields without reducing the feature scale, yielding richer features and helping to alleviate the scale variability problem. A feature attention mechanism based on channel attention further extracts features better suited to text and effectively enables information interaction across channels, which alleviates the detection difficulties caused by the diversity of text distribution. Together these further improve the accuracy of the text detector, and experimental results on four public datasets (ICDAR2015, CTW1500, Total-Text, and MSRA-TD500) demonstrate the effectiveness of the proposed method.
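The claim that a dilated convolution branch offers different receptive fields without reducing the feature scale can be illustrated with a small sketch. This is not the paper's implementation: the function names, the 1D setting, and the dilation rates (1, 2, 4) are assumptions chosen for illustration, since the abstract does not state the actual rates or layer configuration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def output_size(input_size, kernel_size, dilation, padding, stride=1):
    """Standard convolution output-size formula."""
    k_eff = effective_kernel_size(kernel_size, dilation)
    return (input_size + 2 * padding - k_eff) // stride + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 1D dilated cross-correlation with 'same' output length.

    Assumes an odd-length kernel so the window is centred on each position.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_branch(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel branches with hypothetical dilation rates.

    Every branch keeps the input length, so the outputs could be stacked
    or summed by a later fusion step.
    """
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    for d in (1, 2, 4):
        # A 3-tap kernel with padding = d preserves a length-64 input.
        print(d, effective_kernel_size(3, d), output_size(64, 3, d, padding=d))
```

With a 3-tap kernel and padding equal to the dilation rate, each branch covers a wider span as the rate grows while the output length stays equal to the input length, which is the property the abstract relies on.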
Authors
骆文莉
吴秦
LUO Wen-li; WU Qin (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory for Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2022, No. 4, pp. 815-821 (7 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61972180).