
Classification and Indexing of Vouchers Based on Template Matching and Tesseract (cited by: 5)
Abstract Large numbers of paper vouchers are now scanned and stored on computers, but classifying and retrieving these scans has become a major problem. With advances in OCR technology, software products already exist that can recognize and manage scanned documents. In many cases, however, the scans only need to be classified and indexed, and full OCR of the entire voucher is unnecessary. This paper proposes a fast and effective voucher classification method based on template matching, and then uses the open-source software Tesseract to recognize digits and letters, completing the classification and indexing of vouchers. The proposed method is simple and efficient, and it effectively reduces enterprise costs. In addition, to raise the recognition rate, images are preprocessed according to the characteristics of the objects to be recognized. Experiments show that this preprocessing greatly improves the recognition rate, and it may also serve as a reference for professional OCR software.
Source Computer and Modernization (《计算机与现代化》), 2010, No. 7, pp. 132-135 (4 pages)
Keywords optical character recognition (OCR); template matching; image enhancement; Tesseract
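
The abstract does not say which similarity measure or data layout the paper uses for template matching, so the following is only a minimal sketch of the general idea, assuming normalized cross-correlation over small grayscale grids. The names `ncc`, `best_match`, `classify`, the 0.8 threshold, and the toy "invoice"/"receipt" templates are all hypothetical illustrations, not taken from the paper. The Tesseract recognition step is omitted; the matched location would only indicate where a digit or letter field might be cropped for OCR.

```python
from math import sqrt


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized grayscale grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a flat patch or template carries no structure to match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template over the image; return (best score, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = (-1.0, (0, 0))
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best


def classify(image, templates, threshold=0.8):
    """Return (label, location) of the best template at or above the
    threshold, or (None, None) if no template matches well enough."""
    label, top, loc = None, threshold, None
    for name, tpl in templates.items():
        score, pos = best_match(image, tpl)
        if score >= top:
            label, top, loc = name, score, pos
    return label, loc


# Hypothetical 2x2 "logo" templates, one per voucher type.
templates = {
    "invoice": [[0, 255], [255, 0]],
    "receipt": [[255, 255], [0, 0]],
}

# Toy 4x4 scan containing the "invoice" pattern at row 1, column 1.
scan = [
    [10, 10, 10, 10],
    [10, 0, 255, 10],
    [10, 255, 0, 10],
    [10, 10, 10, 10],
]

label, loc = classify(scan, templates)
print(label, loc)
```

A scan that matches no template above the threshold is left unclassified rather than forced into the nearest class, which seems the safer default for an archival index; a real system would more likely rely on an optimized image library than on pure-Python loops.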