Abstract
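To meet the needs of building and using digital library systems, this paper proposes an adaptive region-of-interest (ROI) coding method for document images based on document segmentation. Targeting the structural features of digitized documents, it describes in detail a fast text/picture segmentation algorithm based on block-reduced images and a smearing technique; the segmentation is not restricted by skewed text or by irregularly shaped picture regions. Building on an accurate separation of picture regions from text regions, the paper proposes a compression coding algorithm that adaptively generates a mask image for the picture regions of interest and applies the ROI shift method. Finally, it gives an example of compressing a scanned document containing pictures with this method.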
This paper proposes an approach for adaptively detecting and coding the regions of interest (ROI) in scanned documents, motivated by the needs of digital library applications. The approach builds on a fast segmentation algorithm that operates on a bi-level reduced image. The paper first describes in detail a block technique for reducing the original image, a modified smearing method that simplifies computation, and the fast segmentation algorithm itself. Based on the segmentation result, it then introduces an algorithm for generating the ROI mask image and the max-shift method for document compression. Finally, it presents an example of compressing a scanned document that contains both picture and text areas with the proposed scheme.
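The abstract names three building blocks without giving details: block reduction of the bi-level image, smearing, and max-shift ROI coding. The following is a minimal Python sketch of each idea, not the authors' implementation. The function names, the "any black pixel" reduction rule, the row-wise smearing rule and the `max_gap` parameter are all assumptions made for illustration; the shift step follows the general Maxshift idea known from JPEG 2000, and whether the paper uses exactly that scheme is also an assumption.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # 1 = black (ink), 0 = white


def block_reduce(img: Bitmap, block: int) -> Bitmap:
    """Shrink a bi-level image to one pixel per block x block tile.

    Assumed rule: a tile becomes 1 if it contains any black pixel.
    """
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            has_ink = any(
                img[y][x]
                for y in range(r, min(r + block, rows))
                for x in range(c, min(c + block, cols))
            )
            out_row.append(1 if has_ink else 0)
        out.append(out_row)
    return out


def smear_rows(img: Bitmap, max_gap: int) -> Bitmap:
    """Row-wise smearing: fill white runs of length <= max_gap
    that lie between two black pixels on the same row."""
    out = [row[:] for row in img]
    for row in out:
        last_black = None
        for x, value in enumerate(row):
            if value:
                if last_black is not None and 0 < x - last_black - 1 <= max_gap:
                    for k in range(last_black + 1, x):
                        row[k] = 1
                last_black = x
    return out


def max_shift_encode(coeffs: List[int], roi: List[bool]) -> Tuple[List[int], int]:
    """Scale ROI coefficients by 2**s, with 2**s larger than every
    background magnitude, so that no explicit mask has to be stored."""
    bg_max = max((abs(c) for c, m in zip(coeffs, roi) if not m), default=0)
    s = bg_max.bit_length()  # smallest s with 2**s > bg_max
    return [c * (1 << s) if m else c for c, m in zip(coeffs, roi)], s


def max_shift_decode(shifted: List[int], s: int) -> Tuple[List[int], List[bool]]:
    """Any magnitude >= 2**s must belong to the ROI; shift it back down."""
    threshold = 1 << s
    coeffs, roi = [], []
    for c in shifted:
        in_roi = abs(c) >= threshold
        magnitude = abs(c) >> s if in_roi else abs(c)
        coeffs.append(magnitude if c >= 0 else -magnitude)
        roi.append(in_roi)
    return coeffs, roi
```

In this sketch, smearing on the reduced image merges nearby ink into connected blobs at a fraction of the full-resolution cost, which is presumably where a speed-up of this kind would come from. A zero-valued ROI coefficient is classified as background by the decoder, but its value is still recovered correctly.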
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2002, No. 12A, pp. 1943-1946 (4 pages)
Acta Electronica Sinica
Funding
Ministry of Education Fund for Outstanding Young Teachers (教人司[2000]11)
Chongqing Science and Technology Key Project (No. 2000-5672)