Abstract
Handwriting recognition has been widely studied as a key component of automatic exam grading. To address the complexity of handwritten Chinese text, this paper proposes a handwritten Chinese text recognition method that combines text localization and recognition. Using the text localization information, skewed text is corrected with a perspective transformation. In the feature extraction stage, an attention-based multi-branch convolutional layer extracts features from key regions of the text image and fuses multi-scale features. In the semantic extraction stage, a temporal convolutional network and a Transformer encoder build sequence information and model contextual semantics. Finally, a connectionist temporal classification (CTC) function aligns the sequence features with the character sequence labels. Experiments on the public CASIA-HWDB dataset show that the attention multi-branch convolutional layer and the semantic extraction layer effectively improve the algorithm's performance, demonstrating the feasibility of the proposed method.
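The final alignment step in the abstract relies on connectionist temporal classification, which maps a per-frame prediction sequence to a shorter character sequence. The sketch below illustrates only the standard greedy CTC decoding rule (take the best class per frame, merge consecutive repeats, drop blanks). It is not the authors' implementation: the function name `ctc_greedy_decode`, the blank index 0, and the toy vocabulary and scores are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Greedy CTC decoding.

    frame_scores: list of per-frame score lists (one score per class).
    vocab: dict mapping class index -> character (blank excluded).
    """
    # 1. Pick the highest-scoring class for every frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # 2. Merge runs of identical consecutive predictions.
    collapsed = [k for k, _ in groupby(best)]
    # 3. Remove blanks; a blank between two equal labels keeps both.
    return "".join(vocab[i] for i in collapsed if i != blank)


# Toy example (hypothetical scores): 7 frames, 3 classes (blank, 手, 写).
toy_vocab = {1: "手", 2: "写"}
toy_scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
    [0.8, 0.1, 0.1],
]
decoded = ctc_greedy_decode(toy_scores, toy_vocab)
print(decoded)  # 手手写
```

The per-frame argmax here is 1, 1, 0, 1, 2, 2, 0. The repeated 1s merge into one label, and the blank that separates them from the next 1 is what allows the same character to appear twice in a row in the output.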
Authors
ZHENG Xiaoxu (郑晓旭)
SHU Shanshan (舒珊珊)
WEN Chengyu (文成玉)
College of Communication Engineering, Chengdu University of Information Technology, Chengdu 610225, China
Source
Journal of Chengdu University of Information Technology (《成都信息工程大学学报》)
2023, No. 6, pp. 649-655 (7 pages)