To remove handwritten text from a document image taken with a smartphone, an intelligent removal method was proposed that combines dewarping with a Fully Convolutional Network with Atrous Convolution and Atrous Spatial Pyramid Pooling (FCN-AC-ASPP). First, the dewarping algorithm transforms the photographed image into a regular image. Second, the FCN-AC-ASPP classifies printed and handwritten text. Finally, the handwritten text is removed by a simple algorithm. Experiments show that the classification accuracy of the FCN-AC-ASPP is better than that of FCN, DeepLabV3+, and FCN-AC. In terms of handwritten-text removal, combining dewarping with FCN-AC-ASPP is superior to using FCN-AC-ASPP alone.
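The abstract does not spell out the network layers, so the following is only a minimal sketch of the two ideas named in FCN-AC-ASPP, reduced to one dimension. The function names `atrous_conv1d` and `aspp_1d`, the zero padding, and the rates `(1, 2, 4)` are illustrative assumptions, not the authors' configuration.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """'Same'-size 1D atrous (dilated) convolution, cross-correlation form.

    Kernel taps are spaced `rate` samples apart, so the receptive field
    grows to (len(kernel) - 1) * rate + 1 without adding weights.
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    center = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):  # zero padding outside the signal
                acc += weight * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Toy pyramid: one atrous branch per rate, returned as parallel channels.

    Assumption: a real ASPP module learns separate weights per branch and
    fuses the branches afterwards; here the same kernel is reused.
    """
    return [atrous_conv1d(signal, kernel, rate) for rate in rates]
```

Calling `aspp_1d(row, [1, 0, -1])` on one image row gives three responses of the same length as the input, each looking at a wider context than the previous one.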
Handwriting recognition is a challenge that interests many researchers around the world. Handwritten Arabic script in particular still poses many open problems, given its complex shapes, its more than 100 character forms, and its cursive nature. Over the past few years, good results have been obtained, but at a high cost in memory and execution time. In this paper we propose to improve the capacity of the bidirectional gated recurrent unit (BGRU) to recognize Arabic text. The advantage of BGRUs is their execution time compared with other methods, which can achieve a high success rate but are expensive in time and memory. To test the recognition capacity of the BGRU, the proposed architecture is composed of 6 convolutional neural network (CNN) blocks for feature extraction and 1 BGRU plus 2 dense layers for learning and testing. The experiment is carried out on the entire Institut für Nachrichtentechnik/Ecole Nationale d’Ingénieurs de Tunis (IFN/ENIT) database without any preprocessing or data selection. The results show the ability of BGRUs to recognize handwritten Arabic script.
In recent years, deep learning models have become indispensable in several fields such as computer vision, automatic object recognition, and natural language processing. Building a robust and efficient handwritten text recognition system remains a challenge for the research community, especially for the Arabic language, for which comparatively few works have been published. In this work, we present a new and efficient system for offline Arabic handwritten text recognition. Our approach combines a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BLSTM) network followed by a Connectionist Temporal Classification (CTC) layer. Moreover, during the training phase of the model, we introduce a data augmentation algorithm to increase the quality of the data. The proposed approach can recognize Arabic handwritten text without segmenting the characters, thus avoiding several problems related to segmentation. To train and evaluate our approach, we used two Arabic handwritten text recognition databases, IFN/ENIT and KHATT. The experimental results show that our approach gives better results than other methods in the literature.
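The segmentation-free behaviour of a CTC layer can be illustrated with best-path (greedy) decoding. This is a minimal sketch: the abstract does not say which decoding strategy the authors use, and the function name, the blank index `0`, and the toy Latin alphabet are assumptions made for illustration.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding.

    frame_scores: one list of per-class scores for each time step.
    alphabet:     symbol for each class index (the entry at `blank` is unused).
    Steps: take the arg-max class per frame, merge consecutive repeats,
    then drop blanks. No character segmentation of the input is needed.
    """
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for idx in best_path:
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)
```

A blank between two identical classes is what lets the decoder emit a doubled letter: the path `a a - a b b -` decodes to `aab`, whereas `a a a b b` decodes to `ab`.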
Handwritten character recognition systems are now used in many areas of life, including shopping malls, banks, and educational institutes. Urdu is the national language of Pakistan and the fourth most spoken language in the world. However, recognizing Urdu handwritten characters is still challenging owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of offline and online characters. Our research contributes an Urdu handwritten dataset (UHDS) to support future work in this field. For offline systems, optical readers are used to extract the alphabets, while diagonal-based extraction methods are implemented in online systems. Moreover, our research addresses the lack of comprehensive, standard Urdu alphabet datasets for research on Urdu text recognition. To this end, we collected 1000 handwritten samples for each alphabet, for a total of 38000 samples from writers in the 12 to 25 age group, to train our CNN model using online and offline mediums. We then carried out detailed character recognition experiments, as described in the results. The proposed CNN model outperformed previously published approaches.
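Online continuous text recognition is a new research direction in the field of character recognition. Based on the level-building (LB) method and the dynamic time warping (DTW) algorithm, a handwritten component recognizer for continuous handwritten text recognition was built. Components are treated as an intermediate pattern between stroke segments and continuous text, and a set of 484 handwritten components was established according to the characteristics of handwritten text. Features of the stroke segments, such as length and angle, are extracted and used in the DTW grid matching at each level of LB. The test samples include 6,763 Chinese characters and 303 continuous handwritten texts. Experimental results show that the handwritten component set effectively supports the link between stroke segments and continuous text, with a string recognition rate of 86.47%.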
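The DTW grid matching mentioned in this abstract can be sketched with the classic dynamic-programming recurrence. This is a minimal illustration, not the authors' recognizer: the scalar feature sequences and the absolute-difference local cost are assumptions, and the level-building stage is not shown.

```python
def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Dynamic time warping distance between two feature sequences.

    grid[i][j] holds the cheapest cumulative cost of aligning a[:i] with
    b[:j]; each cell extends the best of its three predecessor cells
    (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    grid = [[inf] * (m + 1) for _ in range(n + 1)]
    grid[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = cost(a[i - 1], b[j - 1])
            grid[i][j] = local + min(grid[i - 1][j],      # a advances
                                     grid[i][j - 1],      # b advances
                                     grid[i - 1][j - 1])  # both advance
    return grid[n][m]
```

Because one element may align with several, a stroke sequence written with a repeated segment still matches its template at zero cost: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is `0.0`. For multi-feature stroke segments such as (length, angle) pairs, a different `cost` function can be passed in.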
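To address uneven illumination in document images and handwritten characters that lie close to or even touch printed characters, a scheme is proposed for extracting characters and distinguishing handwriting from print. First, a double-threshold binarization method based on toggle mapping (TM) is proposed to extract characters from non-uniformly illuminated document images. The whole image is then divided into grid cells of equal size, and an edge feature matrix is extracted from the neighborhood of each cell. Because the features of adjacent cells are similar, a classification framework based on Discriminative Random Fields (DRF) is used to classify the cells as handwritten or printed. Post-processing with text-line information yields a finer and more meaningful classification result. Experiments on an image database of envelope postal-code regions show that the proposed scheme can effectively extract and distinguish touching handwritten and printed characters in non-uniformly illuminated document images. In addition, experiments on the IMA database show that the proposed edge feature matrix matches or even exceeds the performance of features proposed in previous literature for distinguishing handwriting from print.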
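The abstract does not give the double-threshold rule, so the sketch below shows only the basic toggle-mapping idea it builds on: each pixel is compared with the local erosion (minimum) and dilation (maximum) and assigned to whichever is closer. The single `min_contrast` threshold, the square window, and the function name are assumptions made for illustration.

```python
def toggle_binarize(img, radius=1, min_contrast=20):
    """Binarize a grayscale image (list of rows) with basic toggle mapping.

    A pixel becomes ink (1) when it is closer to the local minimum than to
    the local maximum. Windows whose max - min is below `min_contrast` are
    treated as flat background (0), which keeps slow illumination changes
    from being marked as ink.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [img[yy][xx]
                      for yy in range(max(0, y - radius), min(height, y + radius + 1))
                      for xx in range(max(0, x - radius), min(width, x + radius + 1))]
            erosion, dilation = min(window), max(window)
            if dilation - erosion < min_contrast:
                continue  # flat region: leave as background
            if img[y][x] - erosion < dilation - img[y][x]:
                out[y][x] = 1  # closer to the local minimum: dark ink
    return out
```

Because the decision is local, a dark stroke is found wherever it stands out from its own neighborhood, regardless of the absolute brightness of that part of the page. Abrupt shadow edges can still be marked as ink by this simplified rule.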
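Handwritten text recognition can transcribe handwritten documents into editable digital documents. However, owing to widely varying writing styles, highly variable document structures, and low accuracy in character segmentation and recognition, neural-network-based recognition of handwritten English text still faces many challenges. To address these problems, a handwritten English text recognition model based on a convolutional neural network (CNN) and a Transformer is proposed. A CNN first extracts features from the input image; the features are then fed into a Transformer encoder to obtain a prediction for each frame of the feature sequence; finally, a connectionist temporal classification (CTC) decoder produces the final prediction. Extensive experiments on the public IAM (Institut für Angewandte Mathematik) handwritten English word dataset show that the model achieves a character error rate (CER) of 3.60% and a word error rate (WER) of 12.70%, verifying the feasibility of the proposed model.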
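The two metrics reported in this abstract are conventionally computed from the Levenshtein edit distance, normalized by the reference length. The sketch below follows that common definition; the abstract does not state the authors' exact evaluation protocol (for example, case or punctuation handling), so treat the helper names and whitespace tokenization as assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        current = [i]
        for j, h in enumerate(hyp, 1):
            current.append(min(previous[j] + 1,              # deletion
                               current[j - 1] + 1,           # insertion
                               previous[j - 1] + (r != h)))  # substitution
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

One wrong character in a word counts as one character error but a full word error, which is why WER is typically several times larger than CER, as in the 3.60% versus 12.70% figures above.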
Funding (FCN-AC-ASPP handwritten text removal study): Sponsored by the Scientific Research Project of Zhejiang Provincial Department of Education (Grant No. KYY-ZX-20210329).
Funding (BGRU Arabic handwriting recognition study): This research was funded by the Deanship of Scientific Research of the University of Ha’il, Saudi Arabia (Project: RG-20075).
Funding (Urdu handwritten alphabet recognition study): This project was funded by the Deanship of Scientific Research (DSR), King Abdul-Aziz University, Jeddah, Saudi Arabia, under Grant No. RG-11-611-43.