An end-to-end text spotter with text relation networks


Abstract: Automatically reading text in images has become an attractive research topic in computer vision. In particular, end-to-end spotting of scene text has attracted significant research attention, and relatively high accuracy has been achieved on several datasets. However, most existing works overlook the semantic connections between scene text instances and are limited in situations such as occlusion, blurring, and unseen characters, where some semantic information in the text regions is lost. Texts within a scene image are generally related to one another. From the perspective of cognitive psychology, humans often use nearby, easy-to-recognize texts to infer text they cannot identify. In this paper, we propose a novel graph-based method for enhancing intermediate semantic features, called Text Relation Networks. Specifically, we model the co-occurrence relationship of scene texts as a graph. The nodes in the graph represent the text instances in a scene image, and the corresponding semantic features serve as the node representations. The relative positions between text instances determine the edge weights of the graph. A convolution operation is then performed on the graph to aggregate semantic information and enhance the intermediate features corresponding to the text instances. We evaluate the proposed method through comprehensive experiments on several mainstream benchmarks and obtain highly competitive results. For example, on SCUT-CTW1500, our method surpasses the previous top works by 2.1% on the word spotting task.
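The graph construction and aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how relative position is converted into an edge weight, so the Gaussian kernel on box-center distance, the row normalization, the `sigma` value, and the function names `relation_weights` and `graph_conv` are all assumptions made for illustration.

```python
import math


def relation_weights(centers, sigma=100.0):
    """Build row-normalized edge weights from text-instance centers.

    Assumption: closer instances get larger weights via a Gaussian
    kernel on center distance. The paper may use a different measure.
    """
    n = len(centers)
    adj = []
    for i in range(n):
        row = []
        for j in range(n):
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            row.append(math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2)))
        total = sum(row)  # includes the self-loop weight exp(0) = 1
        adj.append([w / total for w in row])
    return adj


def graph_conv(features, adj, weight):
    """One graph convolution layer: ReLU(adj @ features @ weight)."""
    n, d_in, d_out = len(features), len(weight), len(weight[0])
    # Aggregate each node's features with those of its neighbors.
    agg = [[sum(adj[i][k] * features[k][c] for k in range(n))
            for c in range(d_in)] for i in range(n)]
    # Project the aggregated features and apply ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Three hypothetical text instances: two close together, one far away.
centers = [(50.0, 40.0), (60.0, 45.0), (400.0, 300.0)]
# Toy 2-dimensional semantic features, one row per text instance.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned projection

adj = relation_weights(centers)
enhanced = graph_conv(features, adj, identity)
```

In this toy setup the first two instances sit close together, so each one's enhanced feature mixes in most of the other's, while the distant third instance contributes almost nothing. In a trained model the identity matrix would be replaced by learned parameters.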

Authors: Jianguo Jiang, Baole Wei, Min Yu, Gang Li, Boquan Li, Chao Liu, Min Li, Weiqing Huang

Affiliations: Institute of Information Engineering; School of Cyber Security; Centre for Cyber Security Research and Innovation

Source: Cybersecurity (网络空间安全科学与技术（英文）), indexed in EI and CSCD, 2021, Issue 1, pp. 91-103 (13 pages)

Keywords: Scene text spotting; Graph convolutional network; Visual reasoning

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
