Abstract: Handwritten character recognition is considered more challenging than the recognition of machine-printed characters because human writing styles vary. Arabic is morphologically rich, and its characters are highly similar to one another. The Arabic language includes 28 characters. Each character has up to four shapes according to its location in the word (at the beginning, middle, end, and isolated). This paper proposes 12 CNN architectures for recognizing handwritten Arabic characters. The proposed architectures were derived from popular CNN architectures, such as VGG, ResNet, and Inception, to make them applicable to recognizing character-size images. The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models. The experiments also showed that data augmentation improved the models' accuracies on all tested datasets. The proposed model outperformed most of the existing approaches. The best results achieved were 93.05%, 98.30%, and 96.88% on the HIJJA, AHCD, and AIA9K datasets, respectively.
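The "up to four shapes" property can be illustrated with the Unicode presentation forms, which encode the positional shapes of each Arabic letter as separate code points. The sketch below is only an illustration of that property, not part of the paper's method; the helper name positional_forms is hypothetical, and it relies on the standard Unicode character names.

```python
import unicodedata

POSITIONS = ("ISOLATED", "INITIAL", "MEDIAL", "FINAL")

def positional_forms(letter_name):
    """Return {position: character} for the presentation forms Unicode defines."""
    forms = {}
    for position in POSITIONS:
        try:
            forms[position] = unicodedata.lookup(
                f"ARABIC LETTER {letter_name} {position} FORM")
        except KeyError:
            # No such code point: the letter does not take this positional shape.
            pass
    return forms

beh = positional_forms("BEH")    # joins on both sides, so it has four shapes
alef = positional_forms("ALEF")  # joins only to the preceding letter, so two shapes
print(sorted(beh), sorted(alef))
```

A recognizer for handwritten characters therefore has to cope with several visually distinct shapes per class, which is one reason the abstract describes the task as difficult.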
Abstract: This study reviews the latest contributions to Arabic Optical Character Recognition (OCR) over the last decade, to help interested researchers learn the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, the different types of OCR systems, the stages of an Arabic OCR system, researchers' contributions at each stage, and the evaluation metrics for OCR. The study reviews the existing datasets for Arabic OCR and their characteristics. Additionally, this study implements some preprocessing and segmentation stages of Arabic OCR. The study compares the performance of the existing methods in terms of recognition accuracy. In addition to researchers' OCR methods, commercial and open-source systems are included in the comparison. The Arabic language is morphologically rich and is written cursively, with dots and diacritics above and below the characters. Most of the existing approaches in the literature were evaluated on isolated characters or isolated words under a controlled environment, and few approaches were tested on page-level scripts. Some comparative studies show that the accuracy of the existing commercial Arabic OCR systems is low, under 75% for printed text, and further improvement is needed. Moreover, most of the current approaches are offline OCR systems, and there has been no remarkable contribution to online OCR systems.
Abstract: In today's digital era, text may appear in the form of images. This research addresses the problem by recognizing such text using the support vector machine (SVM). A lot of work has been done on handwritten character recognition for the English language, but very little on the under-resourced Hindi language. A method is developed for identifying Hindi characters that uses morphology, edge detection, histograms of oriented gradients (HOG), and SVM classes for summary creation. SVM rank employs the summary to extract essential phrases based on paragraph position, phrase position, numerical data, inverted commas, sentence length, and keyword features. The primary goal of the SVM optimization function is to reduce the number of features by eliminating unnecessary and redundant ones. The second goal is to maintain or improve the classification system's performance. The experiment included news articles from various genres, such as Bollywood, politics, and sports. The proposed method's accuracy for Hindi character recognition is 96.97%, which compares well with baseline approaches, and system-generated summaries are compared to human summaries. The evaluated results show a precision of 72% at a compression ratio of 50% and a precision of 60% at a compression ratio of 25%; in comparison to state-of-the-art methods, this is a decent result.
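Abstract: Objective: To rapidly identify and reassemble the key information in imaging data and PDF reports after the database files of a Picture Archiving and Communication System (PACS) are lost or damaged, so that the data can be used at patients' follow-up visits. Methods: Deep-learning-based optical character recognition and Pydicom were used to read the basic information in PDF and DICOM files, respectively; the links among patient, image, and report were re-established, and the associated data were written to the database. Results: In sampling validation, the method's accuracy, precision, and recall in recognizing images of the same type were all 100%, with an F1 score of 1, and recognition accuracy was consistent across independent samples from different groups. The mean recognition time per report was about 0.14 s (t=-1.005, P=0.315), indicating that recognition time was consistent across independent samples from different groups. Conclusion: The method can effectively shorten patient waiting time after a database failure and restore normal medical operations within a short time. It can be used for emergency response after PACS database data loss, provides a basis for PACS data integration, and offers a new approach to medical imaging data recovery and data integration.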
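The four reported measures follow directly from confusion-matrix counts. The sketch below shows the standard definitions; the function name and the counts passed to it are hypothetical illustrations, not figures from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: with no false positives or false negatives,
# every measure equals 1, matching the pattern the abstract reports.
perfect = classification_metrics(tp=50, fp=0, fn=0, tn=50)
# Hypothetical counts with a few errors, for contrast.
imperfect = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

An F1 of 1 is only possible when both precision and recall are 1, which is why the abstract's 100% figures and its F1 value are consistent with each other.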
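Abstract: Archiving of photographs of telecommunications equipment rooms is currently dominated by manual work, which is inefficient and error-prone. Against this background, the article proposes an archiving system for equipment-room photographs based on an Optical Character Recognition (OCR) model. The system automatically recognizes the text in each photograph, determines which equipment-room location the photograph belongs to, and then classifies and archives the photographs by cabinet position, achieving automated management. In testing, the system's archiving accuracy exceeded 98%, significantly improving the efficiency of archiving equipment-room photographs.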
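Abstract: When nameplate images of instrument transformers are scanned, the images are usually tilted to some degree, and this tilt ultimately lowers character recognition accuracy. To address this problem, a method is proposed that obtains the image tilt angle with the Hough transform and then corrects the image by rotation to improve Optical Character Recognition (OCR) accuracy. The original image is first binarized to obtain the contour of the nameplate; a Hough-transform-based method then extracts the horizontal line segments of the nameplate, the tilt angle of these segments relative to the horizontal is computed, and the image is restored using this angle. Experimental results show that the method computes the image tilt angle quickly and improves OCR accuracy, which can reach over 95%.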
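The tilt-estimation step can be sketched in pure Python as a Hough-style vote restricted to near-horizontal lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 15-degree search range, the 0.5-degree step, and the synthetic edge are all hypothetical, and binarization and contour extraction are assumed to have already produced the list of foreground pixel coordinates.

```python
import math
from collections import Counter

def estimate_skew_deg(points, max_deg=15.0, step_deg=0.5):
    """Hough-style vote over near-horizontal lines; returns the skew in degrees."""
    if not points:
        raise ValueError("need at least one foreground pixel")
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_deg / step_deg))
    for i in range(steps + 1):
        angle = -max_deg + i * step_deg
        a = math.radians(angle)
        sin_a, cos_a = math.sin(a), math.cos(a)
        # A line inclined by `a` satisfies rho = -x*sin(a) + y*cos(a) = const,
        # so collinear pixels pile up in one rho bin at the right angle.
        votes = Counter(round(-x * sin_a + y * cos_a) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle

def rotate_points(points, angle_deg):
    """Rotate pixel coordinates by -angle_deg about the origin to undo the skew."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    return [(x * cos_a + y * sin_a, -x * sin_a + y * cos_a) for x, y in points]

# Synthetic "nameplate edge": foreground pixels on a line tilted by 5 degrees.
slope = math.tan(math.radians(5.0))
edge = [(x, round(50 + x * slope)) for x in range(200)]
skew = estimate_skew_deg(edge)
levelled = rotate_points(edge, skew)
```

After rotation the edge pixels share nearly the same y coordinate, which is the condition under which a line-based OCR engine reads the text rows as horizontal. A production system would more likely call an image library's Hough and rotation routines on the full image rather than on a coordinate list.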
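Abstract: Traditional Optical Character Recognition (OCR) methods generally extract only image intensity features, and their recognition accuracy is low when the image is severely degraded. To address this problem, a new feature extraction method for scanned characters is proposed. In addition to the intensity of each channel, features such as pixel position and the first and second derivatives of intensity are extracted to form a feature image, and each feature is weighted according to its contribution to the image. The covariance matrix of the feature-image block in the local region centered on the current pixel is computed as the descriptor of that pixel, and the characters are then classified in Riemannian space. Experimental results show that, with typical structured classifiers, the proposed feature extraction method achieves higher character recognition accuracy than traditional methods and is more robust.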
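The descriptor described above can be written out as follows. This is a sketch under assumptions: the abstract does not give the exact feature set, the weights w_k, or the metric used in Riemannian space, so the single-channel feature vector and the affine-invariant distance below are common choices for covariance descriptors, used here only as an illustration.

```latex
% Per-pixel feature vector for one intensity channel I (feature set and
% weights w_1..w_m are assumptions; m is the number of features).
\[
  \mathbf{f}(x,y) = \bigl[\, x,\; y,\; I,\; |I_x|,\; |I_y|,\; |I_{xx}|,\; |I_{yy}| \,\bigr]^{\top},
  \qquad
  \tilde{\mathbf{f}} = \operatorname{diag}(w_1,\dots,w_m)\,\mathbf{f}
\]
% Covariance descriptor of the n pixels in the local region R around the current pixel.
\[
  \mathbf{C}_R = \frac{1}{n-1} \sum_{i=1}^{n}
    (\tilde{\mathbf{f}}_i - \boldsymbol{\mu})(\tilde{\mathbf{f}}_i - \boldsymbol{\mu})^{\top},
  \qquad
  \boldsymbol{\mu} = \frac{1}{n}\sum_{i=1}^{n} \tilde{\mathbf{f}}_i
\]
% One common distance between symmetric positive-definite matrices, where
% lambda_k are the generalized eigenvalues of the pair (assumed metric).
\[
  \rho(\mathbf{C}_1,\mathbf{C}_2) = \sqrt{\sum_{k=1}^{m} \ln^{2}\lambda_k(\mathbf{C}_1,\mathbf{C}_2)},
  \qquad
  \lambda_k\,\mathbf{C}_1\mathbf{v}_k = \mathbf{C}_2\mathbf{v}_k
\]
```

Covariance matrices are symmetric positive-definite and do not form a vector space, which is why the comparison is made with a manifold distance rather than a Euclidean one.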
Funding: Project supported by the China-US Million Books Digital Library Project
Abstract: This paper briefly introduces the main ideas of an OCR system designed for sustainable development and based on open-architecture techniques. It then describes the construction of an optical character recognition (OCR) center built on computer clusters, whose purpose is to dynamically improve the recognition precision of the digitized texts of the million volumes of books produced by the China-US Million Books Digital Library (CADAL) Project. The practice of this center will provide a helpful reference for other digital library projects.
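Abstract: This paper describes the background, design, and construction process of a document recognition project at a rail transit manufacturing company. Advanced Optical Character Recognition (OCR) technology was used to improve the efficiency of paper document management, and a general integration standard for the company's OCR platform was summarized and established. As a result, the documents that each business system needs to enter can be recognized within a short time and integrated into the OCR platform for storage and control, which improves processing efficiency and saves labor in the company's research and development, production, and operations.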
Abstract: Optical character recognition for right-to-left and cursive languages such as Arabic is challenging and has received little attention from researchers in the past compared with Latin-script languages. Moreover, the absence of a standard publicly available dataset for several low-resource languages, including Pashto, has remained a hurdle in the advancement of language processing. Recognizing that a clean dataset is the fundamental requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding. With fully autonomous recognition of the cursive Pashto script in view, the first achievement of this research is a clean and standard dataset for the isolated characters of the Pashto script. In this paper, a database of isolated Pashto characters covering forty-four letters in various font styles is introduced. To overcome the shortage of font styles, the graphics software Inkscape was used to generate sufficient image samples for each character. The dataset was pre-processed and reduced in dimensions to 32×32 pixels, and further converted into binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on GitHub and Kaggle in both pixel and Comma Separated Values (CSV) formats.
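The preprocessing described above (reduce to 32×32, binarize to white text on a black background, export as CSV) can be sketched in pure Python. This is an illustration under assumptions, not the dataset's actual pipeline: the function names, the nearest-neighbour resizing, the threshold of 128, the label-first row layout, and the synthetic glyph are all hypothetical.

```python
import csv
import io

def resize_nearest(img, size=32):
    """Nearest-neighbour resize of a grayscale image (list of rows) to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def binarize_white_on_black(img, threshold=128):
    """Dark ink (< threshold) becomes 1 (white text); light paper becomes 0."""
    return [[1 if px < threshold else 0 for px in row] for row in img]

def to_csv_row(label, img):
    """Flatten to one row: class label first, then the pixels row by row."""
    return [label] + [px for row in img for px in row]

# Synthetic 64x64 grayscale glyph: light page (255) with a dark vertical stroke.
page = [[0 if 28 <= c < 36 else 255 for c in range(64)] for r in range(64)]
small = binarize_white_on_black(resize_nearest(page))
row = to_csv_row(0, small)  # 0 is a hypothetical class index

buffer = io.StringIO()
csv.writer(buffer).writerow(row)
```

Each sample becomes one CSV line of 1 + 32×32 = 1025 values, which is the flat layout commonly used for MNIST-style CSV files and makes the data loadable without an image library.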
Abstract: The increasing capabilities of Artificial Intelligence (AI) have led researchers and visionaries to think in the direction of machines outperforming humans by gaining intelligence equal to or greater than that of humans, which may not always have a positive impact on society. AI gone rogue and the Technological Singularity are major concerns in academia as well as industry. It is necessary to identify the limitations of machines and analyze their incompetence, which could draw a line between human and machine intelligence. Internet memes are an amalgam of pictures, videos, underlying messages, ideas, sentiments, humor, and experiences; hence, the way an internet meme is perceived by a human may not be entirely how a machine comprehends it. In this paper, we present experimental evidence of how comprehending internet memes is a challenge for AI. We use a combination of Optical Character Recognition techniques such as Tesseract, PixelLink, and the EAST detector to extract text from the memes, and machine learning algorithms such as Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (RCNN), and transfer learning with a pre-trained DenseNet to assess the textual and facial emotions combined. We evaluate the performance using sensitivity and specificity. Our results show that comprehending memes is indeed a challenging task, and hence a major limitation of AI. This research should be of interest to researchers working in the areas of Artificial General Intelligence and the Technological Singularity.