
Chinese and Korean Script Identification Based on Basic Image Features
Abstract This paper proposes a Chinese and Korean script identification method based on basic image features that is suitable for natural text images of different resolutions. In the training stage, a standard character image library is constructed, the morphological skeleton of each character is extracted, and a BP neural network is trained on the basic image features of the skeletons. In the identification stage, the natural text image is first skew-corrected and then binarized to improve character segmentation; vertical projection, horizontal projection, and character segmentation are then performed, and the skeleton of each character is extracted; finally, the trained BP neural network identifies the script from the basic image features of the character skeletons. Experiments show that the proposed method achieves an overall identification accuracy of up to 87% for Chinese and Korean script identification.
Authors ZHANG Peng, CUI Rongyi (Intelligent Information Processing Lab., Dept. of Computer Science & Technology, College of Engineering, Yanbian University, Yanji 133002, China)
Source Journal of Yanbian University (Natural Science Edition), CAS, 2017, No. 2, pp. 173-178 (6 pages)
Funding Natural Science Foundation of Jilin Province (20140101186JC); Research Project Fund of the State Language Commission (YB125-178)
Keywords script identification; morphological skeleton; basic image features; BP neural network
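
The abstract's identification stage relies on horizontal and vertical projection to segment characters. The record gives no implementation details, so the following is only a minimal sketch of generic projection-based segmentation, assuming the image has already been skew-corrected and binarized (1 = ink, 0 = background). The names `runs` and `segment_characters` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink (foreground), 0 = background


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open [start, end) index ranges where the profile is non-zero."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            spans.append((start, i))       # the run ended at the previous index
            start = None
    if start is not None:                  # run reaches the end of the profile
        spans.append((start, len(profile)))
    return spans


def segment_characters(img: Image) -> List[Tuple[int, int, int, int]]:
    """Split a binarized image into boxes (top, bottom, left, right), half-open.

    Assumption: text lines are horizontal and separated by blank rows, and
    characters within a line are separated by blank columns.
    """
    boxes = []
    # Horizontal projection: ink count per row separates text lines.
    row_profile = [sum(row) for row in img]
    for top, bottom in runs(row_profile):
        line = img[top:bottom]
        # Vertical projection inside the line: ink count per column
        # separates individual characters.
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each returned box could then be cropped and passed on to skeleton extraction and feature computation. A plain blank-column rule like this one may split a character whose components are separated by a vertical gap, so a practical system would likely need a merging step; whether the paper uses one is not stated in the abstract.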