Abstract
Building on a text classification algorithm that combines knowledge engineering with statistical learning, this paper proposes using the layout position of text within a business card image to assist classification. The method makes full use of the spatial relationships among the different kinds of text on the card, and it markedly improves the accuracy of business card information classification.
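The idea of letting layout assist a text-only classifier can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the label set, the regular-expression rules, the scoring weights, and the names `TextLine`, `text_scores`, `layout_scores`, and `classify` are all assumptions made for illustration. The sketch shows one plausible way to multiply a rule-based text score by a simple layout prior so that two lines that look alike as text (a person's name and a job title) are separated by their size and position on the card.

```python
import re
from dataclasses import dataclass

# Hypothetical label set; the paper's actual categories are not given in the abstract.
LABELS = ("name", "phone", "email", "other")


@dataclass
class TextLine:
    text: str        # OCR output for one line on the card
    y_center: float  # vertical centre, 0.0 = top of card, 1.0 = bottom
    height: float    # character height relative to card height


def text_scores(line):
    """Rule-based scores from the text alone (assumed rules, for illustration)."""
    scores = {label: 0.1 for label in LABELS}
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", line.text):
        scores["email"] = 1.0
    elif len(re.findall(r"\d", line.text)) >= 7:
        scores["phone"] = 1.0
    elif re.fullmatch(r"[A-Za-z]+(?: [A-Za-z]+){1,2}", line.text):
        # Two or three plain words: could be a name or a job title.
        scores["name"] = 0.5
        scores["other"] = 0.5
    else:
        scores["other"] = 0.5
    return scores


def layout_scores(line, max_height):
    """Assumed layout prior: the name is usually the largest text,
    and contact details usually sit in the lower part of the card."""
    relative = line.height / max_height
    contact = 0.5 + 0.5 * line.y_center
    return {
        "name": relative,
        "other": 1.0 - 0.5 * relative,
        "phone": contact,
        "email": contact,
    }


def classify(lines):
    """Label each line with the category whose text score times layout score is highest."""
    max_height = max(line.height for line in lines)
    labels = []
    for line in lines:
        t = text_scores(line)
        p = layout_scores(line, max_height)
        combined = {label: t[label] * p[label] for label in LABELS}
        labels.append(max(combined, key=combined.get))
    return labels


# A made-up card used only to exercise the sketch.
card = [
    TextLine("John Smith", 0.30, 0.10),
    TextLine("Sales Manager", 0.45, 0.05),
    TextLine("Tel: +1 555 010 0199", 0.80, 0.04),
    TextLine("john@example.com", 0.90, 0.04),
]
labels = classify(card)
```

On this made-up card the text rules alone cannot tell "John Smith" from "Sales Manager"; the layout prior breaks the tie because the first line is set in the largest type. A statistically trained model could replace either scoring function without changing the way the two scores are combined.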
Source
《电视技术》
Peking University Core Journal
2004, Issue 8, pp. 67-70 (4 pages)
Video Engineering
Funding
National High-Tech R&D Program of China (863 Program) (2001AA114081)
National Natural Science Foundation of China (60241005)
Keywords
text classification
layout information in images
business card
OCR system