
Korean Ancient Book Character Recognition Method Based on Unified Chinese-Korean Ideographic Description Sequence (IDS) Coding
Abstract To address the difficulty of recognizing mixed Chinese and Korean characters in Korean ancient books, this paper proposes a unified ideographic description sequence (IDS) encoding scheme for Chinese and Korean characters, so that Korean ancient book text can be recognized with a radical-decomposition Chinese character recognition model based on contrastive language-image pre-training (CCR-CLIP). First, based on the structural similarity of Chinese and Korean characters, the Chinese character radicals, the Korean letters, and 12 basic structures that occur in the text are given a unified encoding. Second, the IDS sequence file for Chinese characters provided with the original CCR-CLIP model is extended by adding IDS sequences for Korean characters. Finally, the shortage of Korean ancient book samples is addressed by training on printed characters. The results show that, compared with the CCR-SLD method, the character recognition accuracy of the proposed method improves by 13.8% in the Korean ancient book experiment and by 5.38% in the printed text experiment. The proposed method outperforms the compared methods on Korean ancient text recognition and can serve as a reference for further work on this problem.
Authors ZHAO Mengling; JIN Xiaofeng (College of Integration Science, Yanbian University, Yanji 133002, China)
Source Journal of Yanbian University (Natural Science Edition), CAS, 2024, No. 2, pp. 101-106 (6 pages)
Funding Basic Research Project in Humanities and Social Sciences of the Education Department of Jilin Province (JJKH20230608SK).
Keywords Korean ancient books; zero-shot; character recognition; character coding; ideographic description sequences
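
The unified encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the abstract does not publish the actual code table, so the layout rules below are assumptions. The sketch decomposes a precomposed Hangul syllable into its letters using the standard Unicode arithmetic and writes it with the same Unicode ideographic description characters (U+2FF0 to U+2FFB, twelve operators) that are used for Chinese characters. The names `hangul_to_ids`, `to_ids`, and `HANZI_IDS` are hypothetical, and the two-entry Hanzi table merely stands in for the much larger IDS file shipped with CCR-CLIP.

```python
# Sketch only: the layout rules are simplifying assumptions, not the paper's table.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"        # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals, "" = none

# Assumption: these vowels are written below the initial consonant (top-bottom);
# every other vowel, including compound ones, is treated as left-right.
HORIZONTAL_VOWELS = set("ㅗㅛㅜㅠㅡ")

# Hypothetical stand-in for the Chinese-character IDS file.
HANZI_IDS = {
    "好": ["⿰", "女", "子"],
    "字": ["⿱", "宀", "子"],
}


def hangul_to_ids(ch):
    """Return an IDS-style prefix sequence for one precomposed Hangul syllable."""
    code = ord(ch) - 0xAC00
    if not 0 <= code < 19 * 21 * 28:
        raise ValueError(f"{ch!r} is not a precomposed Hangul syllable")
    lead, rest = divmod(code, 21 * 28)
    vowel, tail = divmod(rest, 28)
    l, v, t = CHOSEONG[lead], JUNGSEONG[vowel], JONGSEONG[tail]
    seq = ["⿱" if v in HORIZONTAL_VOWELS else "⿰", l, v]
    if t:
        # A final consonant sits under the initial+vowel block.
        seq = ["⿱"] + seq + [t]
    return seq


def to_ids(ch):
    """Look up a Chinese character or decompose a Hangul syllable."""
    if ch in HANZI_IDS:
        return HANZI_IDS[ch]
    return hangul_to_ids(ch)


if __name__ == "__main__":
    for ch in "好한국":
        print(ch, "".join(to_ids(ch)))
```

Because both scripts end up in one sequence vocabulary, a single text encoder could in principle consume either kind of character, which appears to be the point of extending the Chinese IDS file with Korean entries.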