摘要
During the last two decades signicant work has been reported in the eld of cursive language’s recognition especially,in the Arabic,the Urdu and the Persian languages.The unavailability of such work in the Pashto language is because of:the absence of a standard database and of signicant research work that ultimately acts as a big barrier for the research community.The slight change in the Pashto characters’shape is an additional challenge for researchers.This paper presents an efcient OCR system for the handwritten Pashto characters based on multi-class enabled support vector machine using manifold feature extraction techniques.These feature extraction techniques include,tools such as zoning feature extractor,discrete cosine transform,discrete wavelet transform,and Gabor lters and histogram of oriented gradients.A hybrid feature map is developed by combining the manifold feature maps.This research work is performed by developing a medium-sized dataset of handwritten Pashto characters that encapsulate 200 handwritten samples for each 44 characters in the Pashto language.Recognition results are generated for the proposed model based on a manifold and hybrid feature map.An overall accuracy rates of 63.30%,65.13%,68.55%,68.28%,67.02%and 83%are generated based on a zoning technique,HoGs,Gabor lter,DCT,DWT and hybrid feature maps respectively.Applicability of the proposed model is also tested by comparing its results with a convolution neural network model.The convolution neural network-based model generated an accuracy rate of 81.02%smaller than the multi-class support vector machine.The highest accuracy rate of 83%for the multi-class SVM model based on a hybrid feature map reects the applicability of the proposed model.
基金
funded by Qatar University Internal Grant under Grant No.IRCC-2020-009.The ndings achieved herein are solely the responsibility of the authors。