Abstract
Existing algorithms perform poorly when segmenting handwritten text whose strokes contain many discrete stroke points and additional parts. To address this problem, this paper proposes a text line segmentation algorithm based on segmented foreground smearing (daub) and background thinning. The algorithm applies segmented smearing to the foreground and removes dilated regions whose aspect ratio does not satisfy the required condition, which locates the text regions. It then thins the image background to obtain the segmentation lines between text lines, and introduces a center-of-gravity decision algorithm to resolve characters that overlap between vertically adjacent text lines. Experiments on 210 images containing 2,563 text lines show that the error rate of the algorithm is only 3.3%, lower than that of the horizontal projection, segmented projection, and clustering algorithms, and that it segments text lines relatively completely.
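As an illustration of the segmented smearing step only, the following is a minimal sketch and not the authors' implementation. It assumes a binary image stored as a NumPy array with foreground pixels equal to 1, and it uses simple horizontal run-length smearing inside vertical strips. The function names and the parameters `strip_width` and `max_gap` are hypothetical; the abstract does not specify how the segments or the smearing threshold are chosen.

```python
import numpy as np


def smear_row(row, max_gap):
    """Fill background runs of at most max_gap pixels lying between two foreground pixels."""
    out = row.copy()
    fg = np.flatnonzero(row)  # column indices of foreground pixels
    for a, b in zip(fg[:-1], fg[1:]):
        # b - a - 1 background pixels separate the two foreground pixels
        if 1 < b - a <= max_gap + 1:
            out[a:b] = 1
    return out


def segmented_smear(binary, strip_width, max_gap):
    """Apply horizontal smearing independently inside each vertical strip (assumed scheme)."""
    out = np.zeros_like(binary)
    for x0 in range(0, binary.shape[1], strip_width):
        strip = binary[:, x0:x0 + strip_width]
        for y in range(strip.shape[0]):
            out[y, x0:x0 + strip_width] = smear_row(strip[y], max_gap)
    return out


if __name__ == "__main__":
    img = np.array([[1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]], dtype=np.uint8)
    print(segmented_smear(img, strip_width=6, max_gap=4))
    print(segmented_smear(img, strip_width=12, max_gap=4))
```

Because each strip is smeared on its own, a gap that crosses a strip boundary is never bridged. In this sketch that is what keeps a smeared region from growing across the full page width, which matters when text lines are skewed or curved.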
Source
《计算机工程》
CAS
CSCD
2013, Issue 5, pp. 204-208 (5 pages)
Computer Engineering
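Funding
National Natural Science Foundation of China (61065001)
Special Training Program for Ethnic Minorities, Department of Science and Technology of Xinjiang Uygur Autonomous Region (201023116)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-10-0969)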
Keywords
handwritten text
text line segmentation
segmentation performance
daub
background thinning