Chinese and Korean script identification based on basic image features


Abstract: This paper proposes a Chinese and Korean script identification method based on basic image features that is suitable for natural text images of different resolutions. In the training stage, a standard character image library is first constructed, the morphological skeleton of each character is then extracted, and finally a BP neural network is trained on the basic image features of the skeletons. In the identification stage, the natural text image is first skew-corrected and then binarized to improve character segmentation; vertical projection, horizontal projection, and character segmentation are then performed, and the skeleton of each character is extracted; finally, the trained BP neural network identifies the script from the basic image features of the character skeletons. Experiments show that the proposed method reaches an overall identification accuracy of up to 87% for Chinese and Korean scripts.
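The projection-based segmentation step in the identification stage can be illustrated with a minimal sketch. It assumes the image has already been skew-corrected and binarized into rows of 0/1 values (1 = ink); the function names `runs` and `segment_characters` are hypothetical, and the abstract does not give the thresholds or merging rules the authors actually use, so this is only one plausible reading of "horizontal projection, vertical projection, and character segmentation".

```python
def runs(profile):
    """Return (start, end) index pairs of consecutive non-zero entries."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ended at i (exclusive)
            start = None
    if start is not None:                  # run reaches the end of the profile
        spans.append((start, len(profile)))
    return spans


def segment_characters(binary):
    """Split a binary image (rows of 0/1, 1 = ink) into character boxes.

    The horizontal projection (ink per row) separates text lines; the
    vertical projection (ink per column) inside each line separates
    characters. Boxes are (top, bottom, left, right), with bottom and
    right exclusive. This is an illustrative assumption, not the
    authors' exact procedure.
    """
    boxes = []
    row_profile = [sum(row) for row in binary]
    for top, bottom in runs(row_profile):
        line = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes


# Toy image: two glyphs on the first text line, one on the second.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(segment_characters(image))
# [(1, 3, 1, 3), (1, 3, 4, 6), (4, 5, 2, 5)]
```

A plain zero-gap rule like this can split characters whose left and right components do not touch, which occurs in both Chinese characters and Hangul syllable blocks, so a practical system would likely merge narrow adjacent boxes before skeleton extraction.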

Authors: ZHANG Peng, CUI Rongyi (Intelligent Information Processing Lab., Dept. of Computer Science & Technology, College of Engineering, Yanbian University, Yanji 133002, China)

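Affiliation: Intelligent Information Processing Lab., Dept. of Computer Science & Technology, College of Engineering, Yanbian University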

Source: Journal of Yanbian University (Natural Science Edition) [CAS], 2017, No. 2, pp. 173-178 (6 pages)

Funding: Natural Science Foundation of Jilin Province (20140101186JC); Research Project Fund of the State Language Commission (YB125-178)

Keywords: script identification; morphological skeleton; basic image features; BP neural network

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]



