Abstract
This paper reviews the history of research on Chinese character recognition (CCR). Statistical pattern recognition applied to character images, modeled on human visual perception, is the foundation of the notable progress made in character recognition. The information-entropy theory of pattern recognition reveals the information process underlying pattern classification and its theoretical limits. The paper discusses methods for extracting and selecting features from Chinese character images, and for designing and training character classifiers. It then introduces the key problems that document recognition must solve: character segmentation; layout analysis, understanding, and reconstruction; and improving recognition performance, for example through contextual post-processing with a language model. Finally, it summarizes the major advances in character recognition research and offers an outlook on future work.
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2002, No. 9, pp. 1364-1368 (5 pages)
Acta Electronica Sinica
Funding
National High-Tech R&D Program of China (863 Program) (No. 2001AA114081)
National Natural Science Foundation of China (No. 69972024)
Keywords
Chinese character recognition
document recognition
visual perception
feature extraction
classifier design
layout analysis and understanding