Abstract
Chinese character skeletons are a key topological description of Chinese character glyphs and provide essential information about glyph structure. A skeleton can be regarded as a sequence of points and can therefore be generated by networks with sequence output, such as recurrent neural networks (RNNs). However, current deep neural networks that output long sequences typically suffer from vanishing or exploding gradients, require large numbers of training samples, and take a long time to train. As a result, the skeletons they generate contain few writing details and lack an accurate description of character structure. The proposed method combines the structural information of Chinese characters with neural networks and generates skeletons with multiple parallel random recurrent networks (RRNs). Generation takes place at two levels: at the character level a character is a sequence of strokes, and at the stroke level a stroke is a sequence of points. The generative model can be trained on a small-scale dataset and avoids the vanishing or exploding gradient problem; it both strengthens the description of character structure and retains more skeleton feature points. Experimental results show that the skeletons generated by the proposed method have richer writing details, and that the method can be used to quickly generate large-scale, high-quality Chinese character skeletons.
Authors
GAO Yixing (高奕星)
SHI Lin (施霖)
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Source
Modern Electronics Technique (《现代电子技术》)
PKU Core Journal (北大核心)
2024, Issue 1, pp. 140-146 (7 pages)
Keywords
Chinese character skeleton
generative model
random recurrent network (RRN)
sequence generation
frame structure
distributed network