Handwritten character recognition(HCR)involves identifying characters in images,documents,and various sources such as forms surveys,questionnaires,and signatures,and transforming them into a machine-readable format fo...Handwritten character recognition(HCR)involves identifying characters in images,documents,and various sources such as forms surveys,questionnaires,and signatures,and transforming them into a machine-readable format for subsequent processing.Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle.The use of convolutional neural network(CNN)in recent developments has notably advanced HCR,leveraging the ability to extract discriminative features from extensive sets of raw data.Because of the absence of pre-existing datasets in the Kurdish language,we created a Kurdish handwritten dataset called(KurdSet).The dataset consists of Kurdish characters,digits,texts,and symbols.The dataset consists of 1560 participants and contains 45,240 characters.In this study,we chose characters only from our dataset.We utilized a Kurdish dataset for handwritten character recognition.The study also utilizes various models,including InceptionV3,Xception,DenseNet121,and a customCNNmodel.To show the performance of the KurdSet dataset,we compared it to Arabic handwritten character recognition dataset(AHCD).We applied the models to both datasets to show the performance of our dataset.Additionally,the performance of the models is evaluated using test accuracy,which measures the percentage of correctly classified characters in the evaluation phase.All models performed well in the training phase,DenseNet121 exhibited the highest accuracy among the models,achieving a high accuracy of 99.80%on the Kurdish dataset.And Xception model achieved 98.66%using the Arabic dataset.展开更多
The context of recognizing handwritten city names,this research addresses the challenges posed by the manual inscription of Bangladeshi city names in the Bangla script.In today’s technology-driven era,where precise t...The context of recognizing handwritten city names,this research addresses the challenges posed by the manual inscription of Bangladeshi city names in the Bangla script.In today’s technology-driven era,where precise tools for reading handwritten text are essential,this study focuses on leveraging deep learning to understand the intricacies of Bangla handwriting.The existing dearth of dedicated datasets has impeded the progress of Bangla handwritten city name recognition systems,particularly in critical areas such as postal automation and document processing.Notably,no prior research has specifically targeted the unique needs of Bangla handwritten city name recognition.To bridge this gap,the study collects real-world images from diverse sources to construct a comprehensive dataset for Bangla Hand Written City name recognition.The emphasis on practical data for system training enhances accuracy.The research further conducts a comparative analysis,pitting state-of-the-art(SOTA)deep learning models,including EfficientNetB0,VGG16,ResNet50,DenseNet201,InceptionV3,and Xception,against a custom Convolutional Neural Networks(CNN)model named“Our CNN.”The results showcase the superior performance of“Our CNN,”with a test accuracy of 99.97% and an outstanding F1 score of 99.95%.These metrics underscore its potential for automating city name recognition,particularly in postal services.The study concludes by highlighting the significance of meticulous dataset curation and the promising outlook for custom CNN architectures.It encourages future research avenues,including dataset expansion,algorithm refinement,exploration of recurrent neural networks and attention mechanisms,real-world deployment of models,and extension to other regional languages and scripts.These recommendations offer exciting possibilities for advancing the field of handwritten recognition technology and hold practical implications for enhancing global postal services.展开更多
The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance;however, there remains room for improvement in convergence and practical applications. This study intro...The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance;however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, named the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, whereas the adaptive parameter strategy mitigates the need for manual parameter setting. By amalgamating the hybrid strategy and parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. Through a performance analysis of the CEC2013 benchmark, the AIWGOA demonstrates notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, affirming the superior practical application of the AIWGOA compared with other algorithms.展开更多
Handwritten character recognition is considered challenging compared with machine-printed characters due to the different human writing styles.Arabic is morphologically rich,and its characters have a high similarity.T...Handwritten character recognition is considered challenging compared with machine-printed characters due to the different human writing styles.Arabic is morphologically rich,and its characters have a high similarity.The Arabic language includes 28 characters.Each character has up to four shapes according to its location in the word(at the beginning,middle,end,and isolated).This paper proposed 12 CNN architectures for recognizing handwritten Arabic characters.The proposed architectures were derived from the popular CNN architectures,such as VGG,ResNet,and Inception,to make them applicable to recognizing character-size images.The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models.The experiments showed that data augmentation improved the models’accuracies on all tested datasets.The proposed model outperformed most of the existing approaches.The best achieved results were 93.05%,98.30%,and 96.88%on the HIJJA,AHCD,and AIA9K datasets.展开更多
Handwritten character recognition becomes one of the challenging research matters.More studies were presented for recognizing letters of various languages.The availability of Arabic handwritten characters databases wa...Handwritten character recognition becomes one of the challenging research matters.More studies were presented for recognizing letters of various languages.The availability of Arabic handwritten characters databases was confined.Almost a quarter of a billion people worldwide write and speak Arabic.More historical books and files indicate a vital data set for many Arab nationswritten in Arabic.Recently,Arabic handwritten character recognition(AHCR)has grabbed the attention and has become a difficult topic for pattern recognition and computer vision(CV).Therefore,this study develops fireworks optimizationwith the deep learning-based AHCR(FWODL-AHCR)technique.Themajor intention of the FWODL-AHCR technique is to recognize the distinct handwritten characters in the Arabic language.It initially pre-processes the handwritten images to improve their quality of them.Then,the RetinaNet-based deep convolutional neural network is applied as a feature extractor to produce feature vectors.Next,the deep echo state network(DESN)model is utilized to classify handwritten characters.Finally,the FWO algorithm is exploited as a hyperparameter tuning strategy to boost recognition performance.Various simulations in series were performed to exhibit the enhanced performance of the FWODL-AHCR technique.The comparison study portrayed the supremacy of the FWODL-AHCR technique over other approaches,with 99.91%and 98.94%on Hijja and AHCD datasets,respectively.展开更多
To remove handwritten texts from an image of a document taken by smart phone,an intelligent removal method was proposed that combines dewarping and Fully Convolutional Network with Atrous Convolutional and Atrous Spat...To remove handwritten texts from an image of a document taken by smart phone,an intelligent removal method was proposed that combines dewarping and Fully Convolutional Network with Atrous Convolutional and Atrous Spatial Pyramid Pooling(FCN-AC-ASPP).For a picture taken by a smart phone,firstly,the image is transformed into a regular image by the dewarping algorithm.Secondly,the FCN-AC-ASPP is used to classify printed texts and handwritten texts.Lastly,handwritten texts can be removed by a simple algorithm.Experiments show that the classification accuracy of the FCN-AC-ASPP is better than FCN,DeeplabV3+,FCN-AC.For handwritten texts removal effect,the method of combining dewarping and FCN-AC-ASPP is superior to FCN-AC-ASP alone.展开更多
Auto-grading,as an instruction tool,could reduce teachers’workload,provide students with instant feedback and support highly personalized learning.Therefore,this topic attracts considerable attentions from researcher...Auto-grading,as an instruction tool,could reduce teachers’workload,provide students with instant feedback and support highly personalized learning.Therefore,this topic attracts considerable attentions from researchers recently.To realize the automatic grading of handwritten chemistry assignments,the problem of chemical notations recognition should be solved first.The recent handwritten chemical notations recognition solutions belonging to the end-to-end trainable category suffered fromthe problem of lacking the accurate alignment information between the input and output.They serve the aim of reading notations into electrical devices to better prepare relevant edocuments instead of auto-grading handwritten assignments.To tackle this limitation to enable the auto-grading of handwritten chemistry assignments at a fine-grained level.In this work,we propose a component-detectionbased approach for recognizing off-line handwritten Organic Cyclic Compound Structure Formulas(OCCSFs).Specifically,we define different components of OCCSFs as objects(including graphical objects and text objects),and adopt the deep learning detector to detect them.Then,regarding the detected text objects,we introduce an improved attention-based encoder-decoder model for text recognition.Finally,with these detection results and the geometric relationships of detected objects,this article designs a holistic algorithm for interpreting the spatial structure of handwritten OCCSFs.The proposedmethod is evaluated on a self-collected data set consisting of 3000 samples and achieves promising results.展开更多
A document layout can be more informative than merely a document’s visual and structural appearance.Thus,document layout analysis(DLA)is considered a necessary prerequisite for advanced processing and detailed docume...A document layout can be more informative than merely a document’s visual and structural appearance.Thus,document layout analysis(DLA)is considered a necessary prerequisite for advanced processing and detailed document image analysis to be further used in several applications and different objectives.This research extends the traditional approaches of DLA and introduces the concept of semantic document layout analysis(SDLA)by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts.The proposed SDLA approach enables the derivation of implicit information and semantic characteristics,which can be effectively utilized in dozens of practical applications for various purposes,in a way bridging the semantic gap and providingmore understandable high-level document image analysis and more invariant characterization via absolute and relative labeling.This approach is validated and evaluated on a large dataset ofArabic handwrittenmanuscripts comprising complex layouts.The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts.It also indicates the expected efficacy of using the capabilities of the proposed approach in automating and facilitating many functional,reallife tasks such as effort estimation and pricing of transcription or typing of such complex manuscripts.展开更多
Deep metric learning is one of the recommended methods for the challenge of supporting few/zero-shot learning by deep networks.It depends on building a Siamese architecture of two homogeneous Convolutional Neural Netw...Deep metric learning is one of the recommended methods for the challenge of supporting few/zero-shot learning by deep networks.It depends on building a Siamese architecture of two homogeneous Convolutional Neural Networks(CNNs)for learning a distance function that can map input data from the input space to the feature space.Instead of determining the class of each sample,the Siamese architecture deals with the existence of a few training samples by deciding if the samples share the same class identity or not.The traditional structure for the Siamese architecture was built by forming two CNNs from scratch with randomly initialized weights and trained by binary cross-entropy loss.Building two CNNs from scratch is a trial and error and time-consuming phase.In addition,training with binary crossentropy loss sometimes leads to poor margins.In this paper,a novel Siamese network is proposed and applied to few/zero-shot Handwritten Character Recognition(HCR)tasks.The novelties of the proposed network are in.1)Utilizing transfer learning and using the pre-trained AlexNet as a feature extractor in the Siamese architecture.Fine-tuning a pre-trained network is typically faster and easier than building from scratch.2)Training the Siamese architecture with contrastive loss instead of the binary cross-entropy.Contrastive loss helps the network to learn a nonlinear mapping function that enables it to map the extracted features in the vector space with an optimal way.The proposed network is evaluated on the challenging Chars74K datasets by conducting two experiments.One is for testing the proposed network in few-shot learning while the other is for testing it in zero-shot learning.The recognition accuracy of the proposed network reaches to 85.6%and 82%in few-and zero-shot learning respectively.In addition,a comparison between the performance of the proposed Siamese network and the traditional Siamese CNNs is conducted.The comparison results show that the proposed network achieves higher recognition results in less time.The proposed network reduces the training time from days to hours in both experiments.展开更多
In this paper,Modified Multi-scale Segmentation Network(MMU-SNet)method is proposed for Tamil text recognition.Handwritten texts from digi-tal writing pad notes are used for text recognition.Handwritten words recognit...In this paper,Modified Multi-scale Segmentation Network(MMU-SNet)method is proposed for Tamil text recognition.Handwritten texts from digi-tal writing pad notes are used for text recognition.Handwritten words recognition for texts written from digital writing pad through text file conversion are challen-ging due to stylus pressure,writing on glass frictionless surfaces,and being less skilled in short writing,alphabet size,style,carved symbols,and orientation angle variations.Stylus pressure on the pad changes the words in the Tamil language alphabet because the Tamil alphabets have a smaller number of lines,angles,curves,and bends.The small change in dots,curves,and bends in the Tamil alphabet leads to error in recognition and changes the meaning of the words because of wrong alphabet conversion.However,handwritten English word recognition and conversion of text files from a digital writing pad are performed through various algorithms such as Support Vector Machine(SVM),Kohonen Neural Network(KNN),and Convolutional Neural Network(CNN)for offline and online alphabet recognition.The proposed algorithms are compared with above algorithms for Tamil word recognition.The proposed MMU-SNet method has achieved good accuracy in predicting text,about 96.8%compared to other traditional CNN algorithms.展开更多
To improve the recognition accuracy of off-line handwritten Tibetan characters the local gradient direction histograms based on the wavelet transform are proposed as the recognition features.First for a Tibetan charac...To improve the recognition accuracy of off-line handwritten Tibetan characters the local gradient direction histograms based on the wavelet transform are proposed as the recognition features.First for a Tibetan character sample image the first level approximation component of the Haar wavelet transform is calculated.Secondly the approximation component is partitioned into several equal-sized zones. Finally the gradient direction histograms of each zone are calculated and the local direction histograms of the approximation component are considered as the features of the character sample image.The proposed method is tested on the recently developed off-line Tibetan handwritten character sample database.The experimental results demonstrate the effectiveness and efficiency of the proposed feature extraction method.Furthermore compared with the detail components the approximation component contributes more to the recognition accuracy.展开更多
We present a ghost handwritten digit recognition method for the unknown handwritten digits based on ghost imaging(GI)with deep neural network,where a few detection signals from the bucket detector,generated by the cos...We present a ghost handwritten digit recognition method for the unknown handwritten digits based on ghost imaging(GI)with deep neural network,where a few detection signals from the bucket detector,generated by the cosine transform speckle,are used as the characteristic information and the input of the designed deep neural network(DNN),and the output of the DNN is the classification.The results show that the proposed scheme has a higher recognition accuracy(as high as 98%for the simulations,and 91%for the experiments)with a smaller sampling ratio(say 12.76%).With the increase of the sampling ratio,the recognition accuracy is enhanced.Compared with the traditional recognition scheme using the same DNN structure,the proposed scheme has slightly better performance with a lower complexity and non-locality property.The proposed scheme provides a promising way for remote sensing.展开更多
Handwriting recognition is a challenge that interests many researchers around the world.As an exception,handwritten Arabic script has many objectives that remain to be overcome,given its complex form,their number of f...Handwriting recognition is a challenge that interests many researchers around the world.As an exception,handwritten Arabic script has many objectives that remain to be overcome,given its complex form,their number of forms which exceeds 100 and its cursive nature.Over the past few years,good results have been obtained,but with a high cost of memory and execution time.In this paper we propose to improve the capacity of bidirectional gated recurrent unit(BGRU)to recognize Arabic text.The advantages of using BGRUs is the execution time compared to other methods that can have a high success rate but expensive in terms of time andmemory.To test the recognition capacity of BGRU,the proposed architecture is composed by 6 convolutional neural network(CNN)blocks for feature extraction and 1 BGRU+2 dense layers for learning and test.The experiment is carried out on the entire database of institut für nachrichtentechnik/ecole nationale d’ingénieurs de Tunis(IFN/ENIT)without any preprocessing or data selection.The obtained results show the ability of BGRUs to recognize handwritten Arabic script.展开更多
Handwritten character recognition systems are used in every field of life nowadays,including shopping malls,banks,educational institutes,etc.Urdu is the national language of Pakistan,and it is the fourth spoken langua...Handwritten character recognition systems are used in every field of life nowadays,including shopping malls,banks,educational institutes,etc.Urdu is the national language of Pakistan,and it is the fourth spoken language in the world.However,it is still challenging to recognize Urdu handwritten characters owing to their cursive nature.Our paper presents a Convolutional Neural Networks(CNN)model to recognize Urdu handwritten alphabet recognition(UHAR)offline and online characters.Our research contributes an Urdu handwritten dataset(aka UHDS)to empower future works in this field.For offline systems,optical readers are used for extracting the alphabets,while diagonal-based extraction methods are implemented in online systems.Moreover,our research tackled the issue concerning the lack of comprehensive and standard Urdu alphabet datasets to empower research activities in the area of Urdu text recognition.To this end,we collected 1000 handwritten samples for each alphabet and a total of 38000 samples from 12 to 25 age groups to train our CNN model using online and offline mediums.Subsequently,we carried out detailed experiments for character recognition,as detailed in the results.The proposed CNN model outperformed as compared to previously published approaches.展开更多
In practice, retraining a trained classifier is necessary when novel data become available. This paper adopts an incremental learning procedure to adaptively train a Kernel-based Nonlinear Representor (KNR), a recentl...In practice, retraining a trained classifier is necessary when novel data become available. This paper adopts an incremental learning procedure to adaptively train a Kernel-based Nonlinear Representor (KNR), a recently presented nonlinear classifier for optimal pattern representation, so that its generalization ability may be evaluated in time-variant situation and a sparser representation is obtained for computationally intensive tasks. The addressed techniques are applied to handwritten digit classification to illustrate the feasibility for pattern recognition.展开更多
Even though much advancements have been achieved with regards to the recognition of handwritten characters,researchers still face difficulties with the handwritten character recognition problem,especially with the adv...Even though much advancements have been achieved with regards to the recognition of handwritten characters,researchers still face difficulties with the handwritten character recognition problem,especially with the advent of new datasets like the Extended Modified National Institute of Standards and Technology dataset(EMNIST).The EMNIST dataset represents a challenge for both machine-learning and deep-learning techniques due to inter-class similarity and intra-class variability.Inter-class similarity exists because of the similarity between the shapes of certain characters in the dataset.The presence of intra-class variability is mainly due to different shapes written by different writers for the same character.In this research,we have optimized a deep residual network to achieve higher accuracy vs.the published state-of-the-art results.This approach is mainly based on the prebuilt deep residual network model ResNet18,whose architecture has been enhanced by using the optimal number of residual blocks and the optimal size of the receptive field of the first convolutional filter,the replacement of the first max-pooling filter by an average pooling filter,and the addition of a drop-out layer before the fully connected layer.A distinctive modification has been introduced by replacing the final addition layer with a depth concatenation layer,which resulted in a novel deep architecture having higher accuracy vs.the pure residual architecture.Moreover,the dataset images’sizes have been adjusted to optimize their visibility in the network.Finally,by tuning the training hyperparameters and using rotation and shear augmentations,the proposed model outperformed the state-of-the-art models by achieving average accuracies of 95.91%and 90.90%for the Letters and Balanced dataset sections,respectively.Furthermore,the average accuracies were improved to 95.9%and 91.06%for the Letters and Balanced sections,respectively,by using a group of 5 instances of the trained models and averaging the output class probabilities.展开更多
Using Support Vector Machine(SVM)requires the selection of several parameters such as multi-class strategy type(one-against-all or one-against-one),the regularization parameter C,kernel function and their parameters.T...Using Support Vector Machine(SVM)requires the selection of several parameters such as multi-class strategy type(one-against-all or one-against-one),the regularization parameter C,kernel function and their parameters.The choice of these parameters has a great influence on the performance of the final classifier.This paper considers the grid search method and the particle swarm optimization(PSO)technique that have allowed to quickly select and scan a large space of SVM parameters.A comparative study of the SVM models is also presented to examine the convergence speed and the results of each model.SVM is applied to handwritten Arabic characters learning,with a database containing 4840 Arabic characters in their different positions(isolated,beginning,middle and end).Some very promising results have been achieved.展开更多
The writer identification(WI)of handwritten Arabic text is now of great concern to intelligence agencies following the recent attacks perpetrated by known Middle East terrorist organizations.It is also a useful instru...The writer identification(WI)of handwritten Arabic text is now of great concern to intelligence agencies following the recent attacks perpetrated by known Middle East terrorist organizations.It is also a useful instrument for the digitalization and attribution of old text to other authors of historic studies,including old national and religious archives.In this study,we proposed a new affective segmentation model by modifying an artificial neural network model and making it suitable for the binarization stage based on blocks.This modified method is combined with a new effective rotation model to achieve an accurate segmentation through the analysis of the histogram of binary images.Also,propose a new framework for correct text rotation that will help us to establish a segmentation method that can facilitate the extraction of text from its background.Image projections and the radon transform are used and improved using machine learning based on a co-occurrence matrix to produce binary images.The training stage involves taking a number of images for model training.These images are selected randomly with different angles to generate four classes(0–90,90–180,180–270,and 270–360).The proposed segmentation approach achieves a high accuracy of 98.18%.The study ultimately provides two major contributions that are ranked from top to bottom according to the degree of importance.The proposed method can be further developed as a new application and used in the recognition of handwritten Arabic text from small documents regardless of logical combinations and sentence construction.展开更多
This paper presents a cascaded Hidden Markov Model (HMM), which allows state's transition, skip and duration. The cascaded HMM extends the way of HMM pattern description of Handwritten Chinese Character (HCC) and...This paper presents a cascaded Hidden Markov Model (HMM), which allows state's transition, skip and duration. The cascaded HMM extends the way of HMM pattern description of Handwritten Chinese Character (HCC) and depicts the behavior of handwritten curve more reliably in terms of the statistic probability. Hence character segmentation and labeling are unnecessary. Viterbi algorithm is integrated in the cascaded HMM after the whole sample sequence of a HCC is input. More than 26,000 component samples are used tor training 407 handwritten component HMMs. At the improved training stage 94 models of 94 Chinese characters are gained by 32,000 samples, Compared with the Segment HMMs approach, the recognition rate of this model tier the tirst candidate is 87.89% and the error rate could be reduced by 12.4%.展开更多
The classification for handwritten Chinese character recognition can be viewed as a transformation in discrete vector space. In this paper, from the point of discrete vector space transformation, a new 4-corner codes ...The classification for handwritten Chinese character recognition can be viewed as a transformation in discrete vector space. In this paper, from the point of discrete vector space transformation, a new 4-corner codes classifier based on decision tree inductive learning algorithm ID3 for handwritten Chinese characters is presented. With a feature extraction controller, the classifier can reduce the number of extracted features and accelerate classification speed. Experimental results show that the 4-corner codes classifier performs well on both recognition accuracy and speed.展开更多
文摘Handwritten character recognition(HCR)involves identifying characters in images,documents,and various sources such as forms surveys,questionnaires,and signatures,and transforming them into a machine-readable format for subsequent processing.Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle.The use of convolutional neural network(CNN)in recent developments has notably advanced HCR,leveraging the ability to extract discriminative features from extensive sets of raw data.Because of the absence of pre-existing datasets in the Kurdish language,we created a Kurdish handwritten dataset called(KurdSet).The dataset consists of Kurdish characters,digits,texts,and symbols.The dataset consists of 1560 participants and contains 45,240 characters.In this study,we chose characters only from our dataset.We utilized a Kurdish dataset for handwritten character recognition.The study also utilizes various models,including InceptionV3,Xception,DenseNet121,and a customCNNmodel.To show the performance of the KurdSet dataset,we compared it to Arabic handwritten character recognition dataset(AHCD).We applied the models to both datasets to show the performance of our dataset.Additionally,the performance of the models is evaluated using test accuracy,which measures the percentage of correctly classified characters in the evaluation phase.All models performed well in the training phase,DenseNet121 exhibited the highest accuracy among the models,achieving a high accuracy of 99.80%on the Kurdish dataset.And Xception model achieved 98.66%using the Arabic dataset.
基金MMU Postdoctoral and Research Fellow(Account:MMUI/230023.02).
文摘The context of recognizing handwritten city names,this research addresses the challenges posed by the manual inscription of Bangladeshi city names in the Bangla script.In today’s technology-driven era,where precise tools for reading handwritten text are essential,this study focuses on leveraging deep learning to understand the intricacies of Bangla handwriting.The existing dearth of dedicated datasets has impeded the progress of Bangla handwritten city name recognition systems,particularly in critical areas such as postal automation and document processing.Notably,no prior research has specifically targeted the unique needs of Bangla handwritten city name recognition.To bridge this gap,the study collects real-world images from diverse sources to construct a comprehensive dataset for Bangla Hand Written City name recognition.The emphasis on practical data for system training enhances accuracy.The research further conducts a comparative analysis,pitting state-of-the-art(SOTA)deep learning models,including EfficientNetB0,VGG16,ResNet50,DenseNet201,InceptionV3,and Xception,against a custom Convolutional Neural Networks(CNN)model named“Our CNN.”The results showcase the superior performance of“Our CNN,”with a test accuracy of 99.97% and an outstanding F1 score of 99.95%.These metrics underscore its potential for automating city name recognition,particularly in postal services.The study concludes by highlighting the significance of meticulous dataset curation and the promising outlook for custom CNN architectures.It encourages future research avenues,including dataset expansion,algorithm refinement,exploration of recurrent neural networks and attention mechanisms,real-world deployment of models,and extension to other regional languages and scripts.These recommendations offer exciting possibilities for advancing the field of handwritten recognition technology and hold practical implications for enhancing global postal services.
文摘The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance;however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, named the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, whereas the adaptive parameter strategy mitigates the need for manual parameter setting. By amalgamating the hybrid strategy and parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. Through a performance analysis of the CEC2013 benchmark, the AIWGOA demonstrates notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, affirming the superior practical application of the AIWGOA compared with other algorithms.
文摘Handwritten character recognition is considered challenging compared with machine-printed characters due to the different human writing styles.Arabic is morphologically rich,and its characters have a high similarity.The Arabic language includes 28 characters.Each character has up to four shapes according to its location in the word(at the beginning,middle,end,and isolated).This paper proposed 12 CNN architectures for recognizing handwritten Arabic characters.The proposed architectures were derived from the popular CNN architectures,such as VGG,ResNet,and Inception,to make them applicable to recognizing character-size images.The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models.The experiments showed that data augmentation improved the models’accuracies on all tested datasets.The proposed model outperformed most of the existing approaches.The best achieved results were 93.05%,98.30%,and 96.88%on the HIJJA,AHCD,and AIA9K datasets.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2022R263)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabiathe Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code:22UQU4340237DSR39.
文摘Handwritten character recognition becomes one of the challenging research matters.More studies were presented for recognizing letters of various languages.The availability of Arabic handwritten characters databases was confined.Almost a quarter of a billion people worldwide write and speak Arabic.More historical books and files indicate a vital data set for many Arab nationswritten in Arabic.Recently,Arabic handwritten character recognition(AHCR)has grabbed the attention and has become a difficult topic for pattern recognition and computer vision(CV).Therefore,this study develops fireworks optimizationwith the deep learning-based AHCR(FWODL-AHCR)technique.Themajor intention of the FWODL-AHCR technique is to recognize the distinct handwritten characters in the Arabic language.It initially pre-processes the handwritten images to improve their quality of them.Then,the RetinaNet-based deep convolutional neural network is applied as a feature extractor to produce feature vectors.Next,the deep echo state network(DESN)model is utilized to classify handwritten characters.Finally,the FWO algorithm is exploited as a hyperparameter tuning strategy to boost recognition performance.Various simulations in series were performed to exhibit the enhanced performance of the FWODL-AHCR technique.The comparison study portrayed the supremacy of the FWODL-AHCR technique over other approaches,with 99.91%and 98.94%on Hijja and AHCD datasets,respectively.
基金Sponsored by the Scientific Research Project of Zhejiang Provincial Department of Education(Grant No.KYY-ZX-20210329).
文摘To remove handwritten texts from an image of a document taken by smart phone,an intelligent removal method was proposed that combines dewarping and Fully Convolutional Network with Atrous Convolutional and Atrous Spatial Pyramid Pooling(FCN-AC-ASPP).For a picture taken by a smart phone,firstly,the image is transformed into a regular image by the dewarping algorithm.Secondly,the FCN-AC-ASPP is used to classify printed texts and handwritten texts.Lastly,handwritten texts can be removed by a simple algorithm.Experiments show that the classification accuracy of the FCN-AC-ASPP is better than FCN,DeeplabV3+,FCN-AC.For handwritten texts removal effect,the method of combining dewarping and FCN-AC-ASPP is superior to FCN-AC-ASP alone.
基金supported by National Natural Science Foundation of China (Nos.62007014 and 62177024)the Humanities and Social Sciences Youth Fund of the Ministry of Education (No.20YJC880024)+1 种基金China Post Doctoral Science Foundation (No.2019M652678)the Fundamental Research Funds for the Central Universities (No.CCNU20ZT019).
文摘Auto-grading,as an instruction tool,could reduce teachers’workload,provide students with instant feedback and support highly personalized learning.Therefore,this topic attracts considerable attentions from researchers recently.To realize the automatic grading of handwritten chemistry assignments,the problem of chemical notations recognition should be solved first.The recent handwritten chemical notations recognition solutions belonging to the end-to-end trainable category suffered fromthe problem of lacking the accurate alignment information between the input and output.They serve the aim of reading notations into electrical devices to better prepare relevant edocuments instead of auto-grading handwritten assignments.To tackle this limitation to enable the auto-grading of handwritten chemistry assignments at a fine-grained level.In this work,we propose a component-detectionbased approach for recognizing off-line handwritten Organic Cyclic Compound Structure Formulas(OCCSFs).Specifically,we define different components of OCCSFs as objects(including graphical objects and text objects),and adopt the deep learning detector to detect them.Then,regarding the detected text objects,we introduce an improved attention-based encoder-decoder model for text recognition.Finally,with these detection results and the geometric relationships of detected objects,this article designs a holistic algorithm for interpreting the spatial structure of handwritten OCCSFs.The proposedmethod is evaluated on a self-collected data set consisting of 3000 samples and achieves promising results.
基金This research was supported and funded by KAU Scientific Endowment,King Abdulaziz University,Jeddah,Saudi Arabia.
文摘A document layout can be more informative than merely a document’s visual and structural appearance.Thus,document layout analysis(DLA)is considered a necessary prerequisite for advanced processing and detailed document image analysis to be further used in several applications and different objectives.This research extends the traditional approaches of DLA and introduces the concept of semantic document layout analysis(SDLA)by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts.The proposed SDLA approach enables the derivation of implicit information and semantic characteristics,which can be effectively utilized in dozens of practical applications for various purposes,in a way bridging the semantic gap and providingmore understandable high-level document image analysis and more invariant characterization via absolute and relative labeling.This approach is validated and evaluated on a large dataset ofArabic handwrittenmanuscripts comprising complex layouts.The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts.It also indicates the expected efficacy of using the capabilities of the proposed approach in automating and facilitating many functional,reallife tasks such as effort estimation and pricing of transcription or typing of such complex manuscripts.
文摘Deep metric learning is one of the recommended methods for the challenge of supporting few/zero-shot learning by deep networks.It depends on building a Siamese architecture of two homogeneous Convolutional Neural Networks(CNNs)for learning a distance function that can map input data from the input space to the feature space.Instead of determining the class of each sample,the Siamese architecture deals with the existence of a few training samples by deciding if the samples share the same class identity or not.The traditional structure for the Siamese architecture was built by forming two CNNs from scratch with randomly initialized weights and trained by binary cross-entropy loss.Building two CNNs from scratch is a trial and error and time-consuming phase.In addition,training with binary crossentropy loss sometimes leads to poor margins.In this paper,a novel Siamese network is proposed and applied to few/zero-shot Handwritten Character Recognition(HCR)tasks.The novelties of the proposed network are in.1)Utilizing transfer learning and using the pre-trained AlexNet as a feature extractor in the Siamese architecture.Fine-tuning a pre-trained network is typically faster and easier than building from scratch.2)Training the Siamese architecture with contrastive loss instead of the binary cross-entropy.Contrastive loss helps the network to learn a nonlinear mapping function that enables it to map the extracted features in the vector space with an optimal way.The proposed network is evaluated on the challenging Chars74K datasets by conducting two experiments.One is for testing the proposed network in few-shot learning while the other is for testing it in zero-shot learning.The recognition accuracy of the proposed network reaches to 85.6%and 82%in few-and zero-shot learning respectively.In addition,a comparison between the performance of the proposed Siamese network and the traditional Siamese CNNs is conducted.The comparison results show that the proposed network achieves higher recognition results in less time.The proposed network reduces the training time from days to hours in both experiments.
文摘In this paper,Modified Multi-scale Segmentation Network(MMU-SNet)method is proposed for Tamil text recognition.Handwritten texts from digi-tal writing pad notes are used for text recognition.Handwritten words recognition for texts written from digital writing pad through text file conversion are challen-ging due to stylus pressure,writing on glass frictionless surfaces,and being less skilled in short writing,alphabet size,style,carved symbols,and orientation angle variations.Stylus pressure on the pad changes the words in the Tamil language alphabet because the Tamil alphabets have a smaller number of lines,angles,curves,and bends.The small change in dots,curves,and bends in the Tamil alphabet leads to error in recognition and changes the meaning of the words because of wrong alphabet conversion.However,handwritten English word recognition and conversion of text files from a digital writing pad are performed through various algorithms such as Support Vector Machine(SVM),Kohonen Neural Network(KNN),and Convolutional Neural Network(CNN)for offline and online alphabet recognition.The proposed algorithms are compared with above algorithms for Tamil word recognition.The proposed MMU-SNet method has achieved good accuracy in predicting text,about 96.8%compared to other traditional CNN algorithms.
基金The National Natural Science Foundation of China(No.60963016)the National Social Science Foundation of China(No.17BXW037)
文摘To improve the recognition accuracy of off-line handwritten Tibetan characters the local gradient direction histograms based on the wavelet transform are proposed as the recognition features.First for a Tibetan character sample image the first level approximation component of the Haar wavelet transform is calculated.Secondly the approximation component is partitioned into several equal-sized zones. Finally the gradient direction histograms of each zone are calculated and the local direction histograms of the approximation component are considered as the features of the character sample image.The proposed method is tested on the recently developed off-line Tibetan handwritten character sample database.The experimental results demonstrate the effectiveness and efficiency of the proposed feature extraction method.Furthermore compared with the detail components the approximation component contributes more to the recognition accuracy.
基金the National Natural Science Foundation of China(Grant Nos.61871234 and 11847062).
文摘We present a ghost handwritten digit recognition method for the unknown handwritten digits based on ghost imaging(GI)with deep neural network,where a few detection signals from the bucket detector,generated by the cosine transform speckle,are used as the characteristic information and the input of the designed deep neural network(DNN),and the output of the DNN is the classification.The results show that the proposed scheme has a higher recognition accuracy(as high as 98%for the simulations,and 91%for the experiments)with a smaller sampling ratio(say 12.76%).With the increase of the sampling ratio,the recognition accuracy is enhanced.Compared with the traditional recognition scheme using the same DNN structure,the proposed scheme has slightly better performance with a lower complexity and non-locality property.The proposed scheme provides a promising way for remote sensing.
基金This research was funded by the Deanship of the Scientific Research of the University of Ha’il,Saudi Arabia(Project:RG-20075).
文摘Handwriting recognition is a challenge that interests many researchers around the world.As an exception,handwritten Arabic script has many objectives that remain to be overcome,given its complex form,their number of forms which exceeds 100 and its cursive nature.Over the past few years,good results have been obtained,but with a high cost of memory and execution time.In this paper we propose to improve the capacity of bidirectional gated recurrent unit(BGRU)to recognize Arabic text.The advantages of using BGRUs is the execution time compared to other methods that can have a high success rate but expensive in terms of time andmemory.To test the recognition capacity of BGRU,the proposed architecture is composed by 6 convolutional neural network(CNN)blocks for feature extraction and 1 BGRU+2 dense layers for learning and test.The experiment is carried out on the entire database of institut für nachrichtentechnik/ecole nationale d’ingénieurs de Tunis(IFN/ENIT)without any preprocessing or data selection.The obtained results show the ability of BGRUs to recognize handwritten Arabic script.
基金This project was funded by the Deanship of Scientific Research(DSR),King Abdul-Aziz University,Jeddah,Saudi Arabia under Grant No.(RG-11-611-43).
文摘Handwritten character recognition systems are used in every field of life nowadays,including shopping malls,banks,educational institutes,etc.Urdu is the national language of Pakistan,and it is the fourth spoken language in the world.However,it is still challenging to recognize Urdu handwritten characters owing to their cursive nature.Our paper presents a Convolutional Neural Networks(CNN)model to recognize Urdu handwritten alphabet recognition(UHAR)offline and online characters.Our research contributes an Urdu handwritten dataset(aka UHDS)to empower future works in this field.For offline systems,optical readers are used for extracting the alphabets,while diagonal-based extraction methods are implemented in online systems.Moreover,our research tackled the issue concerning the lack of comprehensive and standard Urdu alphabet datasets to empower research activities in the area of Urdu text recognition.To this end,we collected 1000 handwritten samples for each alphabet and a total of 38000 samples from 12 to 25 age groups to train our CNN model using online and offline mediums.Subsequently,we carried out detailed experiments for character recognition,as detailed in the results.The proposed CNN model outperformed as compared to previously published approaches.
基金Supported by the Key Project of Chinese Ministry of Education (No.105150).
文摘In practice, retraining a trained classifier is necessary when novel data become available. This paper adopts an incremental learning procedure to adaptively train a Kernel-based Nonlinear Representor (KNR), a recently presented nonlinear classifier for optimal pattern representation, so that its generalization ability may be evaluated in time-variant situation and a sparser representation is obtained for computationally intensive tasks. The addressed techniques are applied to handwritten digit classification to illustrate the feasibility for pattern recognition.
文摘Even though much advancements have been achieved with regards to the recognition of handwritten characters,researchers still face difficulties with the handwritten character recognition problem,especially with the advent of new datasets like the Extended Modified National Institute of Standards and Technology dataset(EMNIST).The EMNIST dataset represents a challenge for both machine-learning and deep-learning techniques due to inter-class similarity and intra-class variability.Inter-class similarity exists because of the similarity between the shapes of certain characters in the dataset.The presence of intra-class variability is mainly due to different shapes written by different writers for the same character.In this research,we have optimized a deep residual network to achieve higher accuracy vs.the published state-of-the-art results.This approach is mainly based on the prebuilt deep residual network model ResNet18,whose architecture has been enhanced by using the optimal number of residual blocks and the optimal size of the receptive field of the first convolutional filter,the replacement of the first max-pooling filter by an average pooling filter,and the addition of a drop-out layer before the fully connected layer.A distinctive modification has been introduced by replacing the final addition layer with a depth concatenation layer,which resulted in a novel deep architecture having higher accuracy vs.the pure residual architecture.Moreover,the dataset images’sizes have been adjusted to optimize their visibility in the network.Finally,by tuning the training hyperparameters and using rotation and shear augmentations,the proposed model outperformed the state-of-the-art models by achieving average accuracies of 95.91%and 90.90%for the Letters and Balanced dataset sections,respectively.Furthermore,the average accuracies were improved to 95.9%and 91.06%for the Letters and Balanced sections,respectively,by using a group of 5 instances of the trained models and averaging the output class probabilities.
文摘Using Support Vector Machine(SVM)requires the selection of several parameters such as multi-class strategy type(one-against-all or one-against-one),the regularization parameter C,kernel function and their parameters.The choice of these parameters has a great influence on the performance of the final classifier.This paper considers the grid search method and the particle swarm optimization(PSO)technique that have allowed to quickly select and scan a large space of SVM parameters.A comparative study of the SVM models is also presented to examine the convergence speed and the results of each model.SVM is applied to handwritten Arabic characters learning,with a database containing 4840 Arabic characters in their different positions(isolated,beginning,middle and end).Some very promising results have been achieved.
文摘The writer identification(WI)of handwritten Arabic text is now of great concern to intelligence agencies following the recent attacks perpetrated by known Middle East terrorist organizations.It is also a useful instrument for the digitalization and attribution of old text to other authors of historic studies,including old national and religious archives.In this study,we proposed a new affective segmentation model by modifying an artificial neural network model and making it suitable for the binarization stage based on blocks.This modified method is combined with a new effective rotation model to achieve an accurate segmentation through the analysis of the histogram of binary images.Also,propose a new framework for correct text rotation that will help us to establish a segmentation method that can facilitate the extraction of text from its background.Image projections and the radon transform are used and improved using machine learning based on a co-occurrence matrix to produce binary images.The training stage involves taking a number of images for model training.These images are selected randomly with different angles to generate four classes(0–90,90–180,180–270,and 270–360).The proposed segmentation approach achieves a high accuracy of 98.18%.The study ultimately provides two major contributions that are ranked from top to bottom according to the degree of importance.The proposed method can be further developed as a new application and used in the recognition of handwritten Arabic text from small documents regardless of logical combinations and sentence construction.
文摘This paper presents a cascaded Hidden Markov Model (HMM), which allows state's transition, skip and duration. The cascaded HMM extends the way of HMM pattern description of Handwritten Chinese Character (HCC) and depicts the behavior of handwritten curve more reliably in terms of the statistic probability. Hence character segmentation and labeling are unnecessary. Viterbi algorithm is integrated in the cascaded HMM after the whole sample sequence of a HCC is input. More than 26,000 component samples are used tor training 407 handwritten component HMMs. At the improved training stage 94 models of 94 Chinese characters are gained by 32,000 samples, Compared with the Segment HMMs approach, the recognition rate of this model tier the tirst candidate is 87.89% and the error rate could be reduced by 12.4%.
文摘The classification for handwritten Chinese character recognition can be viewed as a transformation in discrete vector space. In this paper, from the point of discrete vector space transformation, a new 4-corner codes classifier based on decision tree inductive learning algorithm ID3 for handwritten Chinese characters is presented. With a feature extraction controller, the classifier can reduce the number of extracted features and accelerate classification speed. Experimental results show that the 4-corner codes classifier performs well on both recognition accuracy and speed.