Abstract
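To address the technical bottleneck that currently limits OCR recognition rates for printed text-and-image layouts, this paper proposes a similar-glyph recognition technique based on a contextual lexicon, aimed at the accurate recognition of the many visually similar characters in the Chinese character set. Tests on an experimental system show that the new method clearly improves the recognition rate for similar glyphs in printed text-and-image layouts.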
This paper introduces a new method for improving recognition precision on printed documents and proposes an OCR method based on a context vocabulary library. Results from the experimental system show that the new method is effective and noticeably improves precision in discriminating similar Chinese characters.