Abstract
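A major characteristic of Mongolian script is that the characters within a word are joined seamlessly, so a single Mongolian word can be segmented into characters in several ways. Building on this property, this paper proposes a multi-scale method for offline handwritten Mongolian recognition: one handwritten word image is paired with several target sequences, and these sequences jointly constrain training so that the model learns both the fine details of the handwritten image and the word-formation rules of Mongolian more precisely. The paper proposes three character segmentation schemes, "Twelve Prefix" code, presentation form code, and grapheme code, which are nested: a "Twelve Prefix" code can be decomposed into presentation form codes, and a presentation form code can be further decomposed into grapheme codes. The multi-scale model first processes the serialized handwritten image with a multi-layer bidirectional long short-term memory (BiLSTM) network. A first connectionist temporal classification (CTC) layer then maps to the "Twelve Prefix" code sequence, a second CTC layer maps to the presentation form code sequence, and a third CTC layer maps to the grapheme code sequence. The sum of the three CTC losses serves as the total loss of the model. Experiments show that the model achieves the best performance on the public offline handwritten Mongolian dataset MHW: with simple best-path decoding, word recognition accuracy is 66.22% on test set I and 63.97% on test set II.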
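The abstract reports accuracy under simple best-path decoding. As a rough illustration of what that decoding step usually means for a CTC output, the sketch below takes the most probable symbol in each frame, merges consecutive repeats, and drops blanks. The function name `best_path_decode`, the use of index 0 for the blank, and the toy probabilities are assumptions made for this example; they are not taken from the paper.

```python
def best_path_decode(probs, blank=0):
    """Greedy (best-path) CTC decoding.

    probs: list of frames, each a list of per-symbol probabilities.
    blank: index of the CTC blank symbol (assumed to be 0 here).
    Returns the decoded label sequence as a list of symbol indices.
    """
    # Pick the most probable symbol index in every frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    decoded = []
    previous = None
    for symbol in best:
        # Keep a symbol only when it differs from the previous frame
        # (merging repeats) and is not the blank.
        if symbol != previous and symbol != blank:
            decoded.append(symbol)
        previous = symbol
    return decoded


# Toy frames over 3 symbols (0 = blank); the per-frame argmax is 1,1,0,1,2,2,0.
toy_frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
]
decoded_toy = best_path_decode(toy_frames)
```

Here the blank between the second and fourth frames keeps the two occurrences of symbol 1 separate, while the repeated 2 collapses to a single label.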
One major feature of Mongolian is the seamless connection of characters within a word, so a Mongolian word can be divided into characters in multiple ways. A multi-scale Mongolian offline handwriting recognition method is proposed, in which one image of a handwritten Mongolian word is mapped to multiple target sequences to train the model. This paper distinguishes three candidate character division methods: "Twelve Prefix" code, presentation form code, and grapheme code. The multi-scale model processes the sequence of handwritten images with a Bidirectional Long Short-Term Memory network, whose output is then fed into Connectionist Temporal Classification (CTC) layers that map the image to the "Twelve Prefix" code sequence, the presentation form code sequence, and the grapheme code sequence, respectively. The sum of the three CTC losses is used as the total loss function of the model. The experiments show that the model achieves the best performance on the public Mongolian offline handwritten dataset MHW, with 66.22% and 63.97% accuracy on test sets I and II, respectively.
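The total loss described above, a sum of three CTC losses with one per target sequence, can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `ctc_neg_log_likelihood` and `multi_scale_loss`, the blank index 0, and the toy inputs are all assumptions, and the per-head probabilities are simply given as inputs rather than produced by a BiLSTM. The loss uses the standard CTC forward recursion on raw probabilities; a practical implementation would work in log-space to avoid underflow on long sequences.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (forward algorithm).

    probs: list of T frames, each a probability distribution over symbols.
    target: list of label indices without blanks.
    """
    # Extended target: a blank before, between, and after every label.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # Initialisation: a path may start with the blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping the blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


def multi_scale_loss(head_probs, head_targets):
    """Sum of CTC losses, one per (output head, target sequence) pair."""
    return sum(
        ctc_neg_log_likelihood(p, t) for p, t in zip(head_probs, head_targets)
    )


# Toy check: 2 frames, uniform over {blank, label 1}, target [1].
# Three alignments ("1-", "-1", "11") each have probability 0.25.
uniform = [[0.5, 0.5], [0.5, 0.5]]
single_loss = ctc_neg_log_likelihood(uniform, [1])
total_loss = multi_scale_loss([uniform] * 3, [[1], [1], [1]])
```

In the paper's setting the three heads would have different alphabets and different target lengths for the same image ("Twelve Prefix" code, presentation form code, grapheme code); the toy example reuses one distribution only to keep the arithmetic easy to verify.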
Authors
武慧娟
范道尔吉
白凤山
滕达
潘月彩
WU Huijuan; FAN Daoerji; BAI Fengshan; Tengda; PAN Yuecai (College of Electronic Information Engineering, Inner Mongolia University, Hohhot, Inner Mongolia 010021, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals
2022, Issue 10, pp. 81-87 (7 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (61763034)
Natural Science Foundation of Inner Mongolia Autonomous Region (2020MS06005).