Abstract
To address the over-segmentation and under-segmentation that occur when splitting the connected segments of printed Uyghur text, this paper combines horizontal projection with connected-domain (connected-component) search to segment Uyghur text into lines and words. Because connected-segment segmentation has low accuracy, a new segmentation method is also proposed: the part of each connected letter segment that lies above the baseline is projected vertically to find all candidate cut points, and a threshold test then removes the false cuts. Experimental results show that the method improves the segmentation accuracy of printed Uyghur text and lays a foundation for improving the Uyghur recognition rate.
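The steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the baseline is located or which threshold is used, so the sketch assumes that the baseline is the row with the most ink and that the threshold is a minimum piece width. All function names and the toy image are hypothetical, and the connected-domain word segmentation step is omitted. Images are lists of rows with 1 for ink and 0 for background.

```python
def horizontal_projection(img):
    """Count ink pixels (value 1) in each row."""
    return [sum(row) for row in img]


def split_lines(img):
    """Line segmentation: return (top, bottom) ranges, bottom exclusive,
    for each run of rows whose horizontal projection is non-zero."""
    profile = horizontal_projection(img)
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines


def baseline_row(segment):
    """Assumption: the baseline is the row with the largest projection."""
    profile = horizontal_projection(segment)
    return profile.index(max(profile))


def candidate_cuts(segment, baseline):
    """Vertical projection of the rows above the baseline. Each run of
    empty columns yields one candidate cut at its centre column."""
    width = len(segment[0])
    above = [sum(segment[y][x] for y in range(baseline)) for x in range(width)]
    cuts, start = [], None
    for x, count in enumerate(above + [1]):  # sentinel closes a trailing run
        if count == 0 and start is None:
            start = x
        elif count != 0 and start is not None:
            cuts.append((start + x - 1) // 2)
            start = None
    return cuts


def filter_cuts(cuts, width, min_width):
    """Assumed threshold rule: drop any cut that would leave a piece
    narrower than min_width columns on either side."""
    kept, last = [], 0
    for c in cuts:
        if c - last >= min_width and width - c >= min_width:
            kept.append(c)
            last = c
    return kept


# Toy connected segment: two letter bodies and one thin stroke on a baseline.
rows = [
    "000000000000",
    "011100111010",
    "011100111010",
    "111111111111",
    "000000000000",
]
segment = [[int(ch) for ch in row] for row in rows]
base = baseline_row(segment)                  # row 3 in this toy image
cuts = candidate_cuts(segment, base)          # [0, 4, 9, 11]
print(filter_cuts(cuts, len(segment[0]), 3))  # [4, 9]
```

Raising the assumed minimum width from 3 to 4 also rejects the cut at column 9, since it would leave a piece only three columns wide. This shows how the threshold trades over-segmentation against under-segmentation.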
Source
Journal of Dalian Nationalities University (《大连民族学院学报》)
CAS
2014, No. 3, pp. 315-318 (4 pages)
Funding
Supported by the Fundamental Research Funds for the Central Universities (DC11010103)
Keywords
printed Uyghur
character recognition
integral projection
connected domain
segmentation