Abstract
Existing invoice recognition methods cannot extract text that is covered by a seal. To address this, we propose a neural-network-based method for invoice text detection and recognition. First, a lightweight deep neural network detects the seal region, and the color information of that region is used to separate the seal from the text inside it; the separated seal layer still contains the text that the seal covers. Next, a color threshold is applied to the seal layer to extract the covered characters. Finally, the text in the region and the text recovered from under the seal are merged, which effectively removes the seal. Experiments show that the method reduces the amount of computation and that, within seal regions, text detection accuracy improves by 53% and recognition accuracy by 20%, demonstrating the feasibility and effectiveness of the proposed method.
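The color-based steps described in the abstract (separating the seal from the text, thresholding the seal layer, merging the two text masks) can be illustrated with a minimal sketch. It assumes the seal region has already been cropped by a detector, that the seal ink is red, and that the printed text is dark. The function names and the threshold values (`margin`, `limit`, `red_limit`) are hypothetical choices for illustration; the abstract does not state the color space or thresholds the authors use.

```python
# Minimal sketch of color-based seal/text separation on a cropped seal region.
# A region is a list of rows; each pixel is an (R, G, B) tuple in 0..255.
# All thresholds below are illustrative assumptions, not values from the paper.

def is_reddish(pixel, margin=40):
    """True if the red channel clearly dominates green and blue."""
    r, g, b = pixel
    return r - max(g, b) >= margin

def is_dark(pixel, limit=100):
    """True if the mean intensity is low (typical of printed text)."""
    return sum(pixel) / 3 <= limit

def split_by_color(region):
    """Separate the region into a seal layer and a visible-text layer."""
    seal = [[is_reddish(p) for p in row] for row in region]
    text = [[(not is_reddish(p)) and is_dark(p) for p in row] for row in region]
    return seal, text

def covered_text(region, seal, red_limit=160):
    """Within the seal layer, keep pixels whose red value is darkened,
    which is assumed here to indicate text lying under the seal ink."""
    return [
        [seal[y][x] and region[y][x][0] < red_limit for x in range(len(row))]
        for y, row in enumerate(region)
    ]

def merge(mask_a, mask_b):
    """Pixel-wise union of two boolean masks."""
    return [[a or b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]

def recover_text_mask(region):
    """Visible text plus text recovered from under the seal."""
    seal, text = split_by_color(region)
    return merge(text, covered_text(region, seal))

# One-row example: background, plain text, seal ink, seal over text.
example = [[(250, 250, 250), (30, 30, 30), (220, 40, 40), (120, 20, 20)]]
print(recover_text_mask(example))
```

In the example row, only the plain-text pixel and the seal-over-text pixel end up in the merged mask, while the background and pure seal ink are discarded. That outcome corresponds to the seal being removed.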
Authors
JIANG Chongyu (蒋冲宇)
LU Tongwei (鲁统伟)
MIN Feng (闵峰)
XIONG Hanying (熊寒颖)
HU Jiwei (胡记伟)
School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Source
Journal of Wuhan Institute of Technology (《武汉工程大学学报》)
CAS
2019, No. 6, pp. 586-590 (5 pages)
Funding
Graduate Education Innovation Fund of Wuhan Institute of Technology (CX2018202)
Keywords
seal
invoice
color extraction
neural network