Abstract
Character extraction is a prerequisite for recognizing continuous handwritten Chinese text. This paper presents an adaptive character extraction method based on classifying the gaps between strokes. The method computes the horizontal gap between temporally adjacent strokes and classifies these gaps into two classes by histogram analysis. It first obtains the set of strokes belonging to the same text line, then classifies the gaps among the strokes of that line to obtain the strokes belonging to the same character, which yields the individual characters. Because the two-class split is derived from a histogram of the gaps themselves, the method adapts well to character size. The width-to-height ratio of a candidate character serves as the termination condition: classification stops once the ratio falls below a threshold. Test results indicate that the method is effective and robust for extracting characters from continuous handwritten Chinese text.
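The within-line step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things are assumptions made for illustration only: strokes are reduced to bounding boxes listed in writing order, the two-class split of the gap histogram uses an Otsu-style between-class variance criterion (the abstract does not say how the histogram is split), and the names `two_class_threshold`, `split_characters`, and the default `max_ratio=1.2` are hypothetical. Line separation, which the abstract performs first, is omitted.

```python
from typing import List, Optional, Tuple

# A stroke is reduced to its bounding box (x_min, y_min, x_max, y_max).
# Strokes are assumed to be listed in the order they were written.
Box = Tuple[float, float, float, float]


def two_class_threshold(values: List[float], bins: int = 16) -> Optional[float]:
    """Split values into two classes via a histogram.

    Assumption: an Otsu-style criterion (maximum between-class variance)
    is used; the source only says that a histogram is analysed.
    Returns None when all values are equal and no split is possible.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return None
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * m for c, m in zip(counts, centers))
    best_var, best_cut = -1.0, None
    w0, s0 = 0, 0.0
    for i in range(bins - 1):
        w0 += counts[i]
        s0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = s0 / w0, (total_sum - s0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_cut = var, lo + (i + 1) * width
    return best_cut


def width_height_ratio(strokes: List[Box]) -> float:
    """Width-to-height ratio of the box enclosing all given strokes."""
    width = max(s[2] for s in strokes) - min(s[0] for s in strokes)
    height = max(s[3] for s in strokes) - min(s[1] for s in strokes)
    return width / height if height > 0 else float("inf")


def split_characters(strokes: List[Box], max_ratio: float = 1.2) -> List[List[Box]]:
    """Recursively split one text line into character candidates.

    Splitting stops for a group once its width-to-height ratio is at most
    max_ratio (a hypothetical value standing in for the paper's threshold).
    """
    if len(strokes) < 2 or width_height_ratio(strokes) <= max_ratio:
        return [strokes]
    # Horizontal gap between each pair of temporally adjacent strokes.
    gaps = [b[0] - a[2] for a, b in zip(strokes, strokes[1:])]
    cut = two_class_threshold(gaps)
    if cut is None:
        return [strokes]
    groups, current = [], [strokes[0]]
    for gap, stroke in zip(gaps, strokes[1:]):
        if gap >= cut:  # large gap: a new character starts here
            groups.append(current)
            current = [stroke]
        else:  # small gap: same character
            current.append(stroke)
    groups.append(current)
    if len(groups) == 1:
        return [strokes]
    result: List[List[Box]] = []
    for group in groups:
        result.extend(split_characters(group, max_ratio))
    return result
```

For a line of three characters of two strokes each, with gaps of 1 unit inside a character and 5 units between characters, `split_characters` returns three groups of two strokes. Because the threshold comes from the gaps of the line itself rather than from a fixed constant, scaling all coordinates uniformly leaves the grouping unchanged, which is the size adaptivity the abstract refers to.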
Source
Information Technology (《信息技术》)
2005, No. 8, pp. 80-82, 87 (4 pages)
Keywords
continuous handwritten Chinese text
character extraction
histogram analysis