Abstract
This paper proposes a method for distinguishing Chinese from Korean script in natural text images of different resolutions, based on basic image features. In the training stage, a standard character image library is first constructed, the morphological skeleton of each character is then extracted, and finally a BP neural network is trained on the basic image features of the skeletons. In the identification stage, the natural text image is first deskewed and then binarized to improve character segmentation; vertical projection, horizontal projection, and character segmentation are then performed, and the skeleton of each character is extracted; finally, the trained BP neural network identifies the script from the basic image features of the character skeletons. Experimental results show that the method achieves an overall identification accuracy of up to 87% for Chinese and Korean script.
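The projection-based segmentation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `runs` and `segment_characters` are hypothetical, the input is assumed to be an already deskewed and binarized image with ink pixels set to 1, and a character boundary is assumed wherever a projection profile drops to zero. Real Chinese and Korean characters often contain internal gaps, so a practical system would likely need to merge over-split pieces.

```python
import numpy as np


def runs(mask):
    """Return (start, end) pairs (end exclusive) for each run of True in a 1-D boolean array."""
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(diff == 1)   # False -> True transitions
    ends = np.flatnonzero(diff == -1)    # True -> False transitions
    return list(zip(starts.tolist(), ends.tolist()))


def segment_characters(binary):
    """Split a binarized text image (1 = ink) into character boxes.

    The horizontal projection (row sums) separates text lines; within each
    line, the vertical projection (column sums) separates characters.
    Returns a list of (top, bottom, left, right) boxes, ends exclusive.
    """
    boxes = []
    row_profile = binary.sum(axis=1)
    for top, bottom in runs(row_profile > 0):
        line = binary[top:bottom]
        col_profile = line.sum(axis=0)
        for left, right in runs(col_profile > 0):
            boxes.append((top, bottom, left, right))
    return boxes


# Toy image: two blobs on the first text line, one blob on the second.
image = np.zeros((7, 9), dtype=np.uint8)
image[1:3, 1:3] = 1
image[1:3, 5:7] = 1
image[4:6, 2:5] = 1
boxes = segment_characters(image)
```

Each returned box could then be cropped and passed on to skeleton extraction and feature computation, which this sketch does not cover.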
Authors
张鹏
崔荣一
ZHANG Peng, CUI Rongyi (Intelligent Information Processing Lab., Dept. of Computer Science & Technology, College of Engineering, Yanbian University, Yanji 133002, China)
Source
《延边大学学报(自然科学版)》
CAS
2017, No. 2, pp. 173-178 (6 pages)
Journal of Yanbian University(Natural Science Edition)
Funding
Natural Science Foundation of Jilin Province (20140101186JC)
Scientific Research Project Fund of the State Language Commission (YB125-178)
Keywords
script identification
morphological skeleton
basic image features
BP neural network