Implementation of Robust Multi-Font Printed English Character Recognition System

Cited by: 8
Abstract: This paper discusses the main problems solved in designing a practical multi-font English character recognition system. The system recognizes as many as 260 fonts, including italic and bold typefaces. Its recognition rate on the training set reaches 99%, and its error rate on real-world documents is 56% lower than that of TH-OCR2000. The paper describes in detail the common techniques used for text line and character segmentation, feature extraction, classifier design, and post-processing, analyzes and compares the characteristics of each technique, and proposes some new ones. It can serve as guidance for the design of OCR systems.
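Source: Computer Engineering and Applications (《计算机工程与应用》; indexed in CSCD and the Peking University Core Journals list), 2001, No. 20, pp. 120-122 (3 pages)
Funding: National 863 High-Tech Program (No. 863-306-ZT03-03-1); National Natural Science Foundation of China (No. 69972024)
Keywords: multi-font printed English character recognition system; OCR; character segmentation; feature extraction; classifier design; post-processing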