
Research on Automatic Annotation Algorithm for Character-level Oracle-Bone Images Based on Anchor Points (基于锚点的字符级甲骨图像自动标注算法研究)

Cited by: 2
Abstract: Oracle-Bone inscriptions are the earliest systematic Chinese script and the earliest mature Chinese characters known today. Their study is of great significance to historical research and cultural inheritance. At present, however, character-level annotation of Oracle-Bone character images can only be done manually by senior experts in Oracle-Bone studies, which is labor-intensive and inefficient. To address this problem, and building on the Oracle-Bone character image recognition model from the authors' previous work, this paper proposes an automatic annotation algorithm for Oracle-Bone character images. Following a column-first, segment-second approach, the algorithm first assigns each character image on an Oracle-Bone rubbing to a specific column. Then, taking anchor Oracle-Bone characters as reference points, it uses spatial nearest-neighbor relations to find the character image that corresponds to each character in the original Oracle-Bone text, thereby labeling the character images automatically. The labeled character images are added to the sample data set, and the recognition model is retrained on the augmented data set (a 6- to 10-fold increase), which helps improve the recognition accuracy of deep-learning-based Oracle-Bone character recognition. Greatly increasing the number of samples at a small cost also saves experts considerable time and effort.
Authors: SHI Xian-jin (史先进); CAO Shuang (曹爽); ZHANG Chong-sheng (张重生); TAO Yue-feng (陶月锋); LÜ Ling-ling (吕灵灵); SHEN Xia-jiong (沈夏炯). Affiliations: School of Computer and Information Engineering, Laboratory of the Yellow River Cultural Heritage, Henan University, Kaifeng, Henan 475004, China; Henan Electrochemical Education Center, Zhengzhou, Henan 450004, China; School of Electric Power, North China University of Water Conservancy and Hydropower, Zhengzhou, Henan 450045, China.
Source: Acta Electronica Sinica (《电子学报》), 2021, No. 10, pp. 2020-2031 (12 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Keywords: Oracle-Bone inscriptions; image annotation; data augmentation; anchor point; spatial neighbor; pattern recognition
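
The two steps named in the abstract (assign each character image to a column, then label outward from an anchor character) can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the `Box` type, the gap-based column grouping with its `gap_ratio` threshold, and the rule that a character's neighbor in the text maps to the adjacent box in the same column. It is not the authors' algorithm, only one plausible reading of "column first, then anchor-based spatial neighbors".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Box:
    """Hypothetical bounding box of one character image on a rubbing."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def cx(self) -> float:
        return self.x + self.w / 2

    @property
    def cy(self) -> float:
        return self.y + self.h / 2


def group_into_columns(boxes: List[Box], gap_ratio: float = 0.5) -> List[List[int]]:
    """Assumed step 1: assign every box to one vertical column.

    Boxes are sorted by horizontal centre; a new column starts when the
    centre jumps by more than gap_ratio * mean box width. Returns lists of
    box indices, each list sorted top to bottom.
    """
    if not boxes:
        return []
    mean_w = sum(b.w for b in boxes) / len(boxes)
    order = sorted(range(len(boxes)), key=lambda i: boxes[i].cx)
    columns = [[order[0]]]
    for i in order[1:]:
        prev = columns[-1][-1]
        if boxes[i].cx - boxes[prev].cx > gap_ratio * mean_w:
            columns.append([i])
        else:
            columns[-1].append(i)
    for col in columns:
        col.sort(key=lambda i: boxes[i].cy)
    return columns


def label_column(column: List[int], text: str,
                 anchor_box: int, anchor_pos: int) -> Dict[int, str]:
    """Assumed step 2: propagate labels from an anchor along its column.

    column     -- box indices of one column, top to bottom
    text       -- the original text of that column, top to bottom
    anchor_box -- index of a box the recognition model identified reliably
    anchor_pos -- position of that anchor character within text
    """
    k = column.index(anchor_box)
    labels: Dict[int, str] = {}
    for j, box_idx in enumerate(column):
        pos = anchor_pos + (j - k)  # same offset in the text as in the column
        if 0 <= pos < len(text):
            labels[box_idx] = text[pos]
    return labels


# Toy layout: three boxes in one column, two boxes in another.
boxes = [Box(100, 10, 30, 30), Box(100, 60, 30, 30), Box(100, 110, 30, 30),
         Box(40, 10, 30, 30), Box(40, 60, 30, 30)]
columns = group_into_columns(boxes)
# Suppose box 1 was recognized as the middle character of "王占曰".
labels = label_column(columns[1], "王占曰", anchor_box=1, anchor_pos=1)
print(columns, labels)
```

In this toy layout the anchor fixes the alignment between the column and its text, so the boxes directly above and below it receive the neighboring characters. A real rubbing would need handling for damaged or missing characters and for the reading direction of the columns, which this sketch does not attempt.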
