Abstract
Document image processing is an effective means of detecting and filtering spam e-mail that is sent over the network as images. This paper analyzes the layout of color document images in order to segment specific objects in the image, so that later steps can analyze the image and detect whether it contains particular character information. A network spam-filtering system can then use this information to decide whether to filter the message. Experimental results show that the method detects such content effectively in color document images with different color depths and different geometric structures. The method has practical value and potential applications in document image information extraction and filtering.
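The abstract does not spell out the segmentation algorithm, but the keywords name connected components. As an illustrative sketch only, and not the authors' implementation, the following pure-Python function labels 4-connected foreground regions in an already binarized page and returns their bounding boxes. The function name connected_components, the choice of 4-connectivity, the 0/1 input encoding, and the sample page grid are all assumptions made for this example.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions.

    `binary` is a list of rows; 1 marks a foreground pixel, 0 background.
    This encoding is an assumption of the sketch, not taken from the paper.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 4x5 binarized page with three separate blobs.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0],
]
boxes = connected_components(page)
```

In a layout-analysis setting, bounding boxes like these could then be grouped by size and position into candidate text regions; how the paper actually groups or normalizes them is not stated in the abstract.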
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2008, No. 15, pp. 231-233 (3 pages)
Keywords
document image
layout analysis
connected components
normalization