Abstract: This study aims to review the latest contributions in Arabic Optical Character Recognition (OCR) during the last decade, which helps interested researchers know the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, the different types of OCR systems, the stages of an Arabic OCR system, the researchers' contributions in each stage, and the evaluation metrics for OCR. The study reviews the existing datasets for Arabic OCR and their characteristics. Additionally, this study implemented some preprocessing and segmentation stages of Arabic OCR. The study compares the performance of the existing methods in terms of recognition accuracy. In addition to researchers' OCR methods, commercial and open-source systems are included in the comparison. The Arabic language is morphologically rich and written cursively, with dots and diacritics above and under the characters. Most of the existing approaches in the literature were evaluated on isolated characters or isolated words under a controlled environment, and few approaches were tested on page-level scripts. Some comparative studies show that the accuracy of the existing commercial Arabic OCR systems is low, under 75% for printed text, and further improvement is needed. Moreover, most of the current approaches are offline OCR systems, and there is no remarkable contribution to online OCR systems.
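The review above evaluates systems by recognition accuracy; a standard OCR evaluation metric is the character error rate (CER), the edit distance between recognized and reference text divided by the reference length. The sketch below is a generic illustration of that metric, not code from the reviewed study.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters."""
    return levenshtein(ref, hyp) / max(len(ref), 1)

# Example: two errors against a 12-character reference -> CER of about 0.167
print(character_error_rate("Arabic OCR 1", "Arabic 0CR 12"))
```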
Funding: King Saud University, through Researchers Supporting Project number (RSP2022R426).
Abstract: Optical Mark Recognition (OMR) systems have been studied since 1970 and are widely accepted as a data entry technique. OMR technology is used for surveys and multiple-choice questionnaires. Due to its ease of use, OMR technology has grown in popularity over the past two decades and is widely used in universities and colleges to automatically grade student responses to questionnaires. The accuracy of OMR systems is very important due to the environment in which they are used. OMR algorithms rely on pixel projection or the Hough transform to determine the exact answer in the document. These techniques rely on majority voting to approximate a predetermined shape. The performance of these systems depends on precise input from dedicated hardware. Printing and scanning OMR tables introduce artifacts that make table processing error-prone. This observation is a fundamental limitation of traditional pixel projection and Hough transform techniques. Depending on the type of artifact introduced, accuracy is affected differently. We classified the types of errors and their frequency according to the artifacts in the OMR system. As a major contribution, we propose an improved algorithm that fixes errors due to skewness. Our proposal is based on the Hough transform and improves the accuracy of skew-correction mechanisms in OMR documents. As a minor contribution, our proposal also improves the accuracy of detecting markers in OMR documents. The results show an improvement in accuracy over existing algorithms in each of the identified problems. This improvement increases confidence in OMR document processing and increases efficiency when using automated OMR document processing.
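As an illustration of the kind of skew estimation the abstract describes, the sketch below uses OpenCV's Hough line transform to estimate a page's dominant skew angle and rotate it upright. The edge detector, vote threshold, and angle filtering are assumed parameters, not the authors' algorithm.

```python
import cv2
import numpy as np

def deskew(image_path: str) -> np.ndarray:
    """Estimate the dominant skew angle with the Hough transform and rotate the page upright."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, 50, 150)
    # Each detected line is (rho, theta); for a horizontal line theta is 90 degrees.
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)
    if lines is None:
        return gray  # nothing detected; return the page unchanged

    # Keep near-horizontal lines and collect their deviation from 90 degrees.
    angles = []
    for rho, theta in lines[:, 0]:
        deviation = np.degrees(theta) - 90.0
        if abs(deviation) < 30:          # ignore vertical grid lines
            angles.append(deviation)
    if not angles:
        return gray

    skew = float(np.median(angles))
    h, w = gray.shape
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), skew, 1.0)
    return cv2.warpAffine(gray, rotation, (w, h),
                          flags=cv2.INTER_LINEAR, borderValue=255)

# deskewed = deskew("omr_sheet.png")  # hypothetical scanned answer sheet
```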
Funding: Supported by science and technology projects of Gansu State Grid Corporation of China (52272220002U).
Abstract: Optical Character Recognition (OCR) refers to a technology that uses image processing and character recognition algorithms to identify characters in an image. This paper is a deep study of the recognition performance of OCR based on Artificial Intelligence (AI) algorithms, in which the different AI algorithms for OCR are classified and reviewed. Firstly, the mechanisms and characteristics of artificial neural network-based OCR are summarized. Secondly, the paper explores machine learning-based OCR and concludes that the algorithms available for this form of OCR are still in their infancy, with low generalization and fixed recognition errors, albeit with better recognition performance and higher recognition accuracy. Finally, the paper explores several of the latest algorithms, such as deep learning and pattern recognition algorithms. The paper concludes that OCR requires algorithms with higher recognition accuracy.
Abstract: Handwritten character recognition is considered challenging compared with machine-printed character recognition due to the different human writing styles. Arabic is morphologically rich, and its characters have a high similarity. The Arabic language includes 28 characters, and each character has up to four shapes according to its location in the word (at the beginning, in the middle, at the end, and isolated). This paper proposes 12 CNN architectures for recognizing handwritten Arabic characters. The proposed architectures were derived from popular CNN architectures, such as VGG, ResNet, and Inception, to make them applicable to character-size images. The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models. The experiments also showed that data augmentation improved the models' accuracies on all tested datasets. The proposed model outperformed most of the existing approaches, with best results of 93.05%, 98.30%, and 96.88% on the HIJJA, AHCD, and AIA9K datasets, respectively.
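The abstract does not list the 12 architectures themselves; the sketch below is only a generic Keras example of a down-scaled VGG-style network for assumed 32×32 grayscale character images with simple augmentation, to illustrate the kind of adaptation described.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 28  # Arabic alphabet; the cited datasets may define more classes per character form

def small_vgg_style(num_classes: int = NUM_CLASSES) -> tf.keras.Model:
    """A down-scaled VGG-like CNN for 32x32 grayscale character images (illustrative only)."""
    return models.Sequential([
        layers.Input(shape=(32, 32, 1)),
        # light augmentation; these layers are active during training only
        layers.RandomRotation(0.05),
        layers.RandomTranslation(0.1, 0.1),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = small_vgg_style()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, validation_split=0.1, epochs=20)  # x_train: (N, 32, 32, 1) scaled to [0, 1]
```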
Abstract: At present, the demand for perimeter security systems is increasing greatly, especially for systems based on distributed optical fiber sensing. This paper proposes a perimeter security monitoring system based on phase-sensitive coherent optical time domain reflectometry (Ф-COTDR) with a practical pattern recognition function. We use the fast Fourier transform (FFT) to extract features from intrusion events and a multi-class classification algorithm derived from the support vector machine (SVM) as the pattern recognition technique. Five different types of events are classified by the SVM-based classification algorithm using a three-dimensional feature vector. Moreover, the identification results of the pattern recognition system show that an average identification accuracy of 92.62% can be achieved.
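The abstract does not state which three features form the feature vector; the sketch below is a generic illustration, under assumed feature choices, of classifying 1-D sensor traces with FFT-derived features and scikit-learn's multi-class SVM.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def spectral_features(trace: np.ndarray, fs: float = 10_000.0) -> np.ndarray:
    """Three illustrative FFT-based features: dominant frequency, spectral energy, spectral centroid."""
    trace = trace - np.mean(trace)                 # remove DC so the dominant frequency is meaningful
    spectrum = np.abs(np.fft.rfft(trace))
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / fs)
    dominant = float(freqs[np.argmax(spectrum)])
    energy = float(np.sum(spectrum ** 2))
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))
    return np.array([dominant, energy, centroid])

# traces: (N, L) array of vibration signals; labels: (N,) event classes (climbing, knocking, ...)
def train_classifier(traces: np.ndarray, labels: np.ndarray) -> SVC:
    features = np.stack([spectral_features(t) for t in traces])
    x_tr, x_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2, random_state=0)
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")  # multi-class via one-vs-one by default
    clf.fit(x_tr, y_tr)
    print("held-out accuracy:", clf.score(x_te, y_te))
    return clf
```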
Abstract: Based on a comprehensive study of various algorithms, the automatic recognition of readings from traditional ocular optical measuring instruments is realized. Taking a universal tool microscope (UTM) lens view image as an example, a 2-layer automatic recognition model for data reading is established after applying a series of pre-processing algorithms. This model is an optimal combination of a correlation-based template matching method and a concurrent back propagation (BP) neural network. Multiple complementary feature extraction is used in generating the eigenvectors of the concurrent network. To improve fault tolerance, rotation-invariant features based on Zernike moments are extracted from digit characters, and a 4-dimensional group of outline features is also obtained. Moreover, the operating time and reading accuracy can be adjusted dynamically by setting the threshold value. The experimental results indicate that the newly developed algorithm achieves optimal recognition precision and working speed, with an average reading ratio of 97.23%. The recognition method can automatically obtain the results of optical measuring instruments rapidly and stably without modifying their original structure, which meets the application requirements.
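As a generic illustration of the correlation-based template matching component mentioned above (not the paper's 2-layer model), the sketch below matches digit templates against a readout region with OpenCV's normalized cross-correlation.

```python
import cv2
import numpy as np

def best_digit_match(reading_region: np.ndarray,
                     templates: dict[int, np.ndarray]) -> tuple[int, float]:
    """Return the digit whose template gives the highest normalized cross-correlation score."""
    best_digit, best_score = -1, -1.0
    for digit, templ in templates.items():
        result = cv2.matchTemplate(reading_region, templ, cv2.TM_CCOEFF_NORMED)
        _, score, _, _ = cv2.minMaxLoc(result)
        if score > best_score:
            best_digit, best_score = digit, score
    return best_digit, best_score

# Hypothetical usage: grayscale crops of the lens-view readout and one template image per digit 0-9.
# region = cv2.imread("utm_readout_digit.png", cv2.IMREAD_GRAYSCALE)
# templates = {d: cv2.imread(f"template_{d}.png", cv2.IMREAD_GRAYSCALE) for d in range(10)}
# digit, confidence = best_digit_match(region, templates)
```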
Abstract: In human behavior recognition, the traditional dense optical flow method processes too many pixels and carries too much overhead, which limits the running speed. This paper proposes a method combining YOLOv3 (You Only Look Once v3) and a local optical flow method. Based on the dense optical flow method, the optical flow modulus is calculated only in the area where the human target is detected, which reduces the amount of computation and saves time. A threshold value is then set to complete the human behavior identification. Through algorithm design, experimental verification, and other steps, the walking, running, and falling states of the human body in real-life indoor sports video were identified. Experimental results show that this algorithm is more advantageous for jogging behavior recognition.
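The sketch below illustrates the local optical flow idea described above: Farneback dense optical flow is computed only inside a person bounding box assumed to come from a detector such as YOLOv3, and the mean flow magnitude is thresholded. The thresholds and the mapping to behavior labels are assumptions, not the paper's values.

```python
import cv2
import numpy as np

def motion_label(prev_gray: np.ndarray, curr_gray: np.ndarray,
                 box: tuple[int, int, int, int],
                 walk_thr: float = 2.0, run_thr: float = 6.0) -> str:
    """Compute Farneback optical flow only inside the detected person box and threshold its mean magnitude."""
    x, y, w, h = box                       # bounding box from a person detector, e.g. YOLOv3
    prev_roi = prev_gray[y:y + h, x:x + w]
    curr_roi = curr_gray[y:y + h, x:x + w]
    flow = cv2.calcOpticalFlowFarneback(prev_roi, curr_roi, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    mean_mag = float(np.mean(magnitude))
    # Mapping magnitude ranges to these labels is a simplification of the paper's thresholding step.
    if mean_mag < walk_thr:
        return "walking"
    if mean_mag < run_thr:
        return "running"
    return "falling/fast motion"
```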
Funding: Supported by the High Technology Research and Development Programme of China and the Cao Guangbiao High Technology Foundation.
Abstract: In this paper, a simple scheme for the optical implementation of human-face recognition with only an incoherent optical correlator is presented. The system uses a complementary-encoding hit-or-miss transform method to improve the performance of the standard correlator. Based on this method, a compact optical system for human-face recognition is built. The face library stores 200 photographs, and the recognition speed of the system is 10 frames per second. The recognition accuracy is more than 90 percent. The system has good fault tolerance for pictures with rotation distortion, Gaussian noise, or information loss.
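The paper realizes the hit-or-miss transform optically with complementary encoding; as a purely digital illustration of what the transform itself does (not the optical correlator), the sketch below applies OpenCV's morphological hit-or-miss operator to a small synthetic binary image.

```python
import cv2
import numpy as np

# Structuring element: 1 = pixel must be foreground, -1 = must be background, 0 = don't care.
# This pattern fires on background pixels whose 4-neighbours are all foreground.
kernel = np.array([[0,  1, 0],
                   [1, -1, 1],
                   [0,  1, 0]], dtype=np.int32)

# Small synthetic binary image (255 = foreground); a real system would binarize face photographs.
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 255     # a 3x3 white block
img[3, 3] = 0           # punch a hole in its centre so it matches the pattern above

matches = cv2.morphologyEx(img, cv2.MORPH_HITMISS, kernel)
print(np.argwhere(matches > 0))   # -> [[3 3]]: the hole is the only pixel matching the template
```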
Abstract: In this paper, human face machine identification is investigated using optical correlation techniques in the spatial frequency domain. The approach is tested on the ORL face dataset, which includes face images of 40 subjects, each in 10 different positions. The examined optical setup relies on optical correlation based on Vanderlugt filters, whose basics are described in this article. Within the limits of a face database of 40 persons, recognition is achieved with nearly 100% accuracy in matching the input images with their respective synthesized Vanderlugt filters. A software simulation of the face identification is implemented in MATLAB.
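A Vanderlugt filter acts as a matched filter in the Fourier plane; digitally, the same correlation can be sketched with FFTs, as below. This is a simulation-style illustration in the spirit of the MATLAB simulation mentioned, not the paper's optical setup, and the normalization is an assumption.

```python
import numpy as np

def matched_filter_correlation(scene: np.ndarray, reference: np.ndarray) -> float:
    """Correlate a scene with a reference face via the frequency domain (digital analogue of a Vanderlugt filter)."""
    scene = (scene - scene.mean()) / (scene.std() + 1e-12)
    reference = (reference - reference.mean()) / (reference.std() + 1e-12)
    # The filter is the complex conjugate of the reference spectrum; multiplication in the
    # Fourier plane followed by an inverse transform yields the correlation plane.
    corr = np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(reference, s=scene.shape)))
    return float(np.abs(corr).max())   # correlation peak height; higher means a better match

def identify(probe: np.ndarray, gallery: dict[str, np.ndarray]) -> str:
    """Return the gallery identity whose filter produces the strongest correlation peak."""
    return max(gallery, key=lambda name: matched_filter_correlation(probe, gallery[name]))
```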
Abstract: A dexterous hand is equipped with a flexible optical fiber as a sensor for the recognition and identification of objects in structured and non-structured environments. This simple and inexpensive method for object recognition based on the optical fiber is presented in this paper.
Abstract: The purpose of this paper is to develop a mobile Android application, "Car Log", that gives users the ability to track all the costs for a vehicle and to add fuel cost data by taking a photo of the cash receipt from the gas station where the refueling was performed. OCR (optical character recognition) is the conversion of images of typed, handwritten, or printed text into machine-encoded text. Once the text is machine-encoded, it can be used in further machine processes, such as translation, or extracted and transformed to speech, helping people with simple everyday tasks. Users of the application can also enter completely different costs grouped into categories and other charges. The Car Log application can quickly and easily visualize, edit, and add different costs for a car. It also supports multiple profiles, for example by entering data for all cars in a single family or a small business. The test results are positive, so we intend to further develop a cloud-ready application.
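The paper does not name the OCR engine behind Car Log; as a hedged illustration of the receipt-to-text step, the sketch below uses the open-source Tesseract engine through the pytesseract wrapper (in Python rather than Android Java, purely for illustration), with a hypothetical amount-parsing heuristic.

```python
import re
from typing import Optional

from PIL import Image
import pytesseract

def extract_fuel_amount(receipt_path: str) -> Optional[float]:
    """OCR a photographed receipt and pull out the first money-like amount (illustrative heuristic)."""
    text = pytesseract.image_to_string(Image.open(receipt_path))
    # Look for patterns such as "54.20" or "54,20"; a real app would parse totals more carefully.
    match = re.search(r"\d+[.,]\d{2}", text)
    if match is None:
        return None
    return float(match.group(0).replace(",", "."))

# amount = extract_fuel_amount("receipt_photo.jpg")  # hypothetical image of a cash receipt
```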
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Gait recognition is an active research area that uses a walking theme to identify the subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. However, in practice, it remains a challenging task under diverse walking sequences due to covariant factors such as normal walking and walking while wearing a coat. Researchers have, over the years, worked on successfully identifying subjects using different techniques, but there is still room for improvement in accuracy due to these covariant factors. This paper proposes an automated model-free framework for human gait recognition. The proposed method has a few critical steps. Firstly, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both the original and the optical flow cropped frames; training is conducted with static hyperparameters. The third step proposes a fusion technique termed normal distribution serial fusion. In the fourth step, an improved optimization algorithm is applied to select the best features, which are then classified using a bi-layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experiments, obtaining average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieves improved accuracy compared to other methods.
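As an illustration of the fine-tuning step described above, the sketch below adapts an ImageNet-pretrained MobileNetV2 to gait frames in Keras. The input size, class count, and hyperparameters are assumptions, not the authors' settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_SUBJECTS = 124  # e.g. CASIA B has 124 subjects; adjust to the dataset split actually used

def finetuned_mobilenetv2(num_classes: int = NUM_SUBJECTS) -> tf.keras.Model:
    """MobileNetV2 backbone (ImageNet weights) with a new classification head for gait frames."""
    base = tf.keras.applications.MobileNetV2(include_top=False, weights="imagenet",
                                             input_shape=(224, 224, 3))
    base.trainable = False          # freeze the backbone first; unfreeze top layers later for fine-tuning
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# Train once on original frames and once on optical-flow-cropped frames, as the abstract describes,
# then fuse the two models' deep features before feature selection and classification.
```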
Funding: This work is supported by the Natural Science Foundation of China (Grant No. 61903056), the Major Project of the Science and Technology Research Program of the Chongqing Education Commission of China (Grant No. KJZDM201900601), the Chongqing Research Program of Basic Research and Frontier Technology (Grant Nos. cstc2019jcyj-msxmX0681, cstc2021jcyj-msxmX0530, and cstc2021jcyjmsxmX0761), the Chongqing Municipal Key Laboratory of Institutions of Higher Education (Grant No. cqupt-mct-201901), the Chongqing Key Laboratory of Mobile Communications Technology (Grant No. cqupt-mct-202002), and the Engineering Research Center of Mobile Communications, Ministry of Education (Grant No. cqupt-mct202006).
Abstract: Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features that characterize the facial appearance and geometry changes caused by facial motions. On this basis, the video in this paper is divided into multiple segments, each of which is simultaneously described by optical flow and a facial landmark trajectory. To deeply mine the emotional information in these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments on the CK+ and MMI datasets demonstrate the superiority of the proposed method.
Abstract: Optical character recognition for right-to-left and cursive languages such as Arabic is challenging and has received little attention from researchers in the past compared to Latin languages. Moreover, the absence of a standard publicly available dataset for several low-resource languages, including Pashto, has remained a hurdle in the advancement of language processing. Recognizing that a clean dataset is the fundamental and core requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding, keeping in view the full autonomous recognition of the cursive Pashto script. The first achievement of this research is a clean and standard dataset for the isolated characters of the Pashto script. In this paper, a database of isolated Pashto characters for forty-four alphabets using various font styles is introduced. To overcome the shortage of font styles, the graphical software Inkscape has been used to generate sufficient image data samples for each character. The dataset has been pre-processed, reduced to 32×32 pixels, and converted into a binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on the standard GitHub and Kaggle database servers in both pixel and Comma Separated Values (CSV) formats.
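The sketch below illustrates the preprocessing the abstract describes: resizing a character image to 32×32, binarizing it to white text on a black background, and flattening it to a CSV row in an MNIST-like layout. The threshold and label handling are assumptions.

```python
import csv
import numpy as np
from PIL import Image

def to_mnist_like_row(image_path: str, label: int, threshold: int = 128) -> list[int]:
    """Resize to 32x32, binarize to a white glyph on a black background, and flatten to [label, 1024 pixels]."""
    img = Image.open(image_path).convert("L").resize((32, 32))
    pixels = np.array(img)
    # Assume the source is dark text on a light background; invert so the glyph becomes white (255).
    binary = np.where(pixels < threshold, 255, 0).astype(np.uint8)
    return [label] + binary.flatten().tolist()

def write_csv(rows: list[list[int]], out_path: str = "pashto_chars.csv") -> None:
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

# rows = [to_mnist_like_row("alef_sample_001.png", label=0)]  # hypothetical file and label
# write_csv(rows)
```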
Funding: This work was supported by the Scientific Research Fund of the Hunan Provincial Education Department of China (Project No. 17A007) and the Teaching Reform and Research Project of Hunan Province of China (Project No. JG1615).
Abstract: The two-stream convolutional neural network exhibits excellent performance in video action recognition. The crux of the approach is to use frames already clipped from the videos and optical flow images pre-extracted from the frames to train one model each, and finally to integrate the outputs of the two models. Nevertheless, the reliance on pre-extraction of the optical flow impedes the efficiency of action recognition, and the temporal and spatial streams are simply fused at the end, so one stream can fail while the other succeeds. We propose a novel hidden two-stream collaborative (HTSC) learning network that hides the optical flow extraction steps inside the network and greatly speeds up action recognition. Based on the two-stream method, the two-stream collaborative learning model captures the interaction of the temporal and spatial features to greatly enhance recognition accuracy. The proposed method achieves a balance of efficiency and precision on large-scale video action recognition datasets.
Abstract: In this paper, the author analyzes the characteristics of the interference signal in distributed optical fiber sensing and the method for extracting it. In distributed optical fiber sensing, realizing the alarm and positioning functions through the cross-correlation operation alone increases the load on the system and can greatly raise its false-alarm rate. Therefore, adding an interference-signal feature recognition stage before the localization algorithm is necessary: it avoids unnecessary computation, reduces the load on the system, and also reduces the number of false positives.
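To make the cross-correlation positioning step concrete, the sketch below estimates the time delay between two sensor signals with NumPy and converts it to a position along the fiber. The sampling rate and propagation speed are assumed values, not taken from the paper.

```python
import numpy as np

def locate_event(sig_a: np.ndarray, sig_b: np.ndarray,
                 fs: float = 1e6, v_fiber: float = 2e8) -> float:
    """Estimate the event position (in metres from the midpoint) from the cross-correlation lag
    between two interference signals (assumed dual-path sensing geometry)."""
    corr = np.correlate(sig_a - sig_a.mean(), sig_b - sig_b.mean(), mode="full")
    lag_samples = int(np.argmax(corr)) - (len(sig_b) - 1)   # positive when sig_a lags sig_b
    delay = lag_samples / fs
    return 0.5 * v_fiber * delay

# Example with a synthetic 50-sample delay:
rng = np.random.default_rng(0)
pulse = rng.standard_normal(4096)
delayed = np.roll(pulse, 50)
print(locate_event(delayed, pulse))   # about 0.5 * 2e8 * 50 / 1e6 = 5000 m
```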
Funding: Funded by the National Plan for Science, Technology and Innovation (MAARIFAH), King Abdulaziz City for Science and Technology, Kingdom of Saudi Arabia, Award Number (5-18-03-001-0004).
Abstract: Braille-assistive technologies have helped blind people to write, read, learn, and communicate with sighted individuals for many years. These technologies enable blind people to engage with society and help break down communication barriers in their lives. The Optical Braille Recognition (OBR) system is one example of these technologies. It plays an important role in facilitating communication between sighted and blind people and assists sighted individuals in reading and understanding documents of Braille cells. However, a clear gap exists in current OBR systems regarding asymmetric multilingual conversion of Braille documents: few systems allow sighted people to read and understand Braille documents for self-learning applications. In this study, we propose a deep learning-based approach to convert Braille images into multilingual texts. This is achieved through a set of effective steps that start with image acquisition and preprocessing and end with a Braille multilingual mapping step. We develop a deep convolutional neural network (DCNN) model that takes its inputs from the second step of the approach for recognizing Braille cells. Several experiments are conducted on two datasets of Braille images to evaluate the performance of the DCNN model. The first dataset contains 1,404 labeled images of 27 Braille symbols representing the alphabet characters. The second dataset consists of 5,420 labeled images of 37 Braille symbols that represent alphabet characters, numbers, and punctuation. The proposed model achieved a classification accuracy of 99.28% on the test set of the first dataset and 98.99% on the test set of the second dataset. These results confirm the applicability of the DCNN model used in our proposed approach for multilingual Braille conversion in communicating with sighted people.
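The final step in the abstract maps recognized Braille cells to text in more than one language; the toy sketch below shows what such a mapping table could look like once the DCNN has produced class indices. The cell indices and language entries are hypothetical, not from the paper.

```python
# Hypothetical mapping from DCNN class indices (recognized Braille cells) to characters
# in two target scripts; a real system would cover all 27 or 37 classes from the datasets.
BRAILLE_TO_TEXT = {
    0: {"en": "a", "ar": "ا"},
    1: {"en": "b", "ar": "ب"},
    2: {"en": "c", "ar": "ج"},
}

def cells_to_text(predicted_classes: list[int], lang: str = "en") -> str:
    """Convert a sequence of recognized Braille-cell class indices into text in the chosen language."""
    return "".join(BRAILLE_TO_TEXT.get(c, {}).get(lang, "?") for c in predicted_classes)

print(cells_to_text([2, 0, 1], lang="en"))   # -> "cab"
print(cells_to_text([2, 0, 1], lang="ar"))   # -> the corresponding Arabic letters
```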
Abstract: Automated recognition of a person is one of the most critical issues in modern society. Common biometric systems rely on the surface topography of an object and are thus potentially vulnerable to spoofing. Optical coherence tomography (OCT) is a technology that can probe the internal structure of multilayered tissues. This paper describes an algorithm for automated fingerprint recognition that is applied to OCT fingerprint images. The algorithm is based on scanning of the enhanced and segmented OCT images.