Abstract
Picking out document images from very large collections of images is an active research topic in network image processing. To recognize document images more effectively, this paper starts from the textual characteristics of document images and proposes a recognition method based on the connected region matrix. First, the image is binarized and the connected region matrix of the binary image is computed. Second, an eight-dimensional feature vector is extracted from the connected region matrix. Finally, a BP neural network is trained on these features and used to recognize document images. Experiments show that the method markedly reduces the false recognition rate while maintaining a high recognition rate.
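The first two stages of the pipeline can be illustrated with a minimal sketch. The abstract does not define the connected region matrix or list the eight features, so everything specific here is an assumption: the fixed threshold in `binarize`, the choice of 8-connectivity, the row layout `[top, left, height, width, area]`, and the eight statistics returned by `region_features` are hypothetical stand-ins, not the paper's definitions. The BP neural network stage is omitted; the resulting vector would simply be the input to such a classifier.

```python
from collections import deque
from statistics import fmean, pstdev


def binarize(gray, threshold=128):
    """Return a 0/1 image where 1 marks dark pixels (fixed threshold is an assumption)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_region_matrix(binary):
    """Return one row [top, left, height, width, area] per 8-connected foreground region."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, bottom, left, right, area = r, r, c, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append([top, left, bottom - top + 1, right - left + 1, area])
    return regions


def region_features(regions, rows, cols):
    """Return eight illustrative statistics (hypothetical, not the paper's feature set)."""
    if not regions:
        return [0.0] * 8
    heights = [reg[2] for reg in regions]
    widths = [reg[3] for reg in regions]
    fills = [reg[4] / (reg[2] * reg[3]) for reg in regions]   # ink share of each box
    aspects = [reg[3] / reg[2] for reg in regions]            # width / height
    ink_fraction = sum(reg[4] for reg in regions) / (rows * cols)
    return [
        float(len(regions)),
        fmean(heights), pstdev(heights),
        fmean(widths), pstdev(widths),
        fmean(fills), fmean(aspects),
        ink_fraction,
    ]
```

The intuition behind statistics of this kind is that text tends to produce many small regions of similar height, whereas photographs tend to produce few, irregular ones. Whether the paper relies on the same cues is not stated in the abstract.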
Source
《信息工程大学学报》
2012, No. 3, pp. 329-333 (5 pages)
Journal of Information Engineering University
Keywords
document image identification
image classification
connected region matrix
BP artificial neural network