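A Character Segmentation Algorithm Based on Synthetic Decision

Chinese title: 多知识综合判决的字符切分算法 (literally, "A Character Segmentation Algorithm Based on Multi-Knowledge Synthetic Decision"). Cited by: 5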
Abstract In high-performance printed-text recognition systems, single-character recognition is now relatively mature, which makes character segmentation the critical step. Character segmentation can be viewed as a decision process that determines the correct cut positions at character boundaries. Geometric information alone is not enough to make this decision accurately; it must also take into account the local character recognition results and the global contextual relationships. Based on a study of character segmentation for Chinese, Japanese, and Korean text, this paper proposes a character segmentation algorithm based on multi-knowledge synthetic decision. The algorithm has been successfully applied in the AsiaOCR project, and it also handles mixed-in English text, which is common in East Asian documents. Experimental results show that, compared with previous algorithms, the new algorithm reduces the segmentation error rate in Chinese, Japanese, and Korean recognition systems by 50% on average.
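The abstract does not give the details of the decision procedure. As a rough illustration only, the idea of combining local recognition results with contextual information when choosing cut positions can be sketched as a Viterbi-style search over candidate cuts. Everything below is an assumption made for illustration and not the paper's method: the function names `best_segmentation`, `recognize`, and `context_bonus`, the log-confidence scoring, the `max_span` limit, and the toy confidences are all hypothetical.

```python
import math

def best_segmentation(cuts, recognize, context_bonus, max_span=2):
    """Choose the cut sequence with the highest combined score.

    cuts          -- sorted candidate cut positions along a text line
    recognize     -- hypothetical recognizer: (start, end) -> (label, confidence) or None
    context_bonus -- hypothetical context score for two adjacent labels
    max_span      -- largest number of elementary pieces merged into one character
    """
    n = len(cuts)
    # best[(i, j)] = (score, previous segment, label) for the best path
    # whose last character spans cuts[i]..cuts[j].
    best = {}
    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            result = recognize(cuts[i], cuts[j])
            if result is None:
                continue
            label, conf = result
            local = math.log(conf)  # local evidence from single-character recognition
            if i == 0:
                best[(i, j)] = (local, None, label)
                continue
            # Extend every path that ends exactly where this segment starts.
            candidates = [
                (score + local + context_bonus(prev_label, label), seg)
                for seg, (score, _, prev_label) in best.items()
                if seg[1] == i
            ]
            if candidates:
                score, prev = max(candidates, key=lambda c: c[0])
                best[(i, j)] = (score, prev, label)
    finals = [(v[0], seg) for seg, v in best.items() if seg[1] == n - 1]
    if not finals:
        return [], float("-inf")
    score, seg = max(finals, key=lambda f: f[0])
    labels = []
    while seg is not None:  # walk back through the stored predecessors
        labels.append(best[seg][2])
        seg = best[seg][1]
    return labels[::-1], score

# Toy data (invented): "明" can also be read as the two pieces "日" and "月".
TOY = {(0, 10): ("日", 0.6), (10, 20): ("月", 0.6),
       (0, 20): ("明", 0.9), (20, 30): ("天", 0.9)}

def toy_recognize(start, end):
    return TOY.get((start, end))

def toy_context(prev_label, label):
    # Reward the common word "明天"; all other pairs are neutral.
    return 0.5 if (prev_label, label) == ("明", "天") else 0.0

labels, score = best_segmentation([0, 10, 20, 30], toy_recognize, toy_context)
```

In this toy case the merged reading "明" followed by "天" outscores the over-segmented reading "日", "月", "天", because both the local confidence and the context term favor it. A real system would replace the toy recognizer and context function with actual classifier outputs and a language model.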
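Source Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2002, No. 17, pp. 59-61, 72 (4 pages in total)
Funding National 863 High-Tech Research and Development Program of China (No. 2001AA114081); National Natural Science Foundation of China (No. 69972024)
Keywords multi-knowledge synthetic decision; character segmentation algorithm; optical character recognition; contextual analysis; text recognition; printed-text recognition system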

References (10)

  • [1] Casey R G, Lecolinet E. A survey of methods and strategies in character segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(7): 690-706
  • [2] Xuhong Xiao, Leedham G. Cursive script segmentation incorporating knowledge of writing[C]. In: Document Analysis and Recognition, ICDAR'99, 1999: 535-538
  • [3] Xian Wang, Govindaraju V, Srihari S. Multi-experts for touching digit string recognition[C]. In: Document Analysis and Recognition, ICDAR'99, 1999: 800-803
  • [4] Lin Yu Tseng, Rung Ching Chen. A new method for segmenting handwritten Chinese characters[C]. In: Document Analysis and Recognition, ICDAR'97, 1997: 568-571
  • [5] Ishidera E, Nishiwaki D, Yamada J. Unconstrained Japanese address recognition using a combination of spatial information and word knowledge[C]. In: Document Analysis and Recognition, ICDAR'97, 1997: 1016-1022
  • [6] Seong-Whan Lee, Dong-June Lee, Hee-Seon Park. A new methodology for gray-scale character segmentation and recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(10): 1045-1050
  • [7] Kuo H H, Wang J F. A new method for the segmentation of mixed handprinted Chinese/English characters[C]. In: Document Analysis and Recognition, ICDAR'93, 1993: 810-813
  • [8] Seong-Whan Lee, Jong-Soo Kim. Multi-lingual, multi-font and multi-size large-set character recognition using self-organizing neural network[C]. In: ICDAR'95, 1995: 28-33
  • [9] Tong Zhang, Xiaoqing Ding. Cluster-based bilingual script segmentation and language identification[C]. In: ICMI'99
  • [10] Yefeng Zheng, Changsong Liu, Xiaoqing Ding. Single character type identification[J]. Document Recognition and Retrieval, SPIE, 2002
