Funding: This work was funded by the Deanship of Scientific Research at Jouf University under Grant Number (DSR2022-RG-0114).
Abstract: A challenge faced by visually impaired persons in their day-to-day lives is interpreting text from documents. To help them, the objective of this work is to develop an efficient text recognition system that isolates, extracts, and recognizes text in documents that have a textured background, degraded colors, and poor quality, and that synthesizes the text into speech. The system consists of three algorithms: a text localization and detection algorithm based on the mathematical morphology method (MMM); a text extraction algorithm based on the gamma correction method (GCM); and an optical character recognition (OCR) algorithm for text recognition. A detailed complexity study of the different blocks of this text recognition system has been carried out. Following this study, an accelerated version of the GCM algorithm (AGCM) is proposed. The AGCM algorithm reduces the complexity of the text recognition system by 70% while keeping the same text recognition quality as the original method. To assist visually impaired persons, a graphical interface for the entire text recognition chain has been developed, allowing the capture of images from a camera, rapid and intuitive visualization of the text recognized in the image, and text-to-speech synthesis. Our text recognition system provides an improvement of 6.8% in recognition rate and 7.6% in F-measure relative to the GCM and AGCM algorithms.
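The gamma correction step named in this abstract can be illustrated with a minimal sketch. It assumes an 8-bit grayscale image stored as rows of integers, and it uses a lookup table so the power function is evaluated only 256 times. The function names, the gamma value, and the fixed threshold are illustrative assumptions; this is not the paper's GCM or AGCM, whose parameter selection the abstract does not describe.

```python
def build_gamma_lut(gamma):
    """Precompute the gamma-corrected output for every 8-bit intensity."""
    return [round(255 * (v / 255) ** gamma) for v in range(256)]


def gamma_correct(image, gamma):
    """Apply gamma correction to a grayscale image given as rows of 0..255 ints."""
    lut = build_gamma_lut(gamma)
    return [[lut[pixel] for pixel in row] for row in image]


def binarize(image, threshold=128):
    """Mark pixels darker than the threshold as text (1), the rest as background (0)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


# Toy image (assumed values): dark text pixels (40) on a darkish background (100, 120).
image = [[40, 100, 120],
         [100, 40, 120]]

# gamma < 1 brightens mid-tones more than dark tones, which lifts the background
# above the threshold while the text pixels stay below it.
corrected = gamma_correct(image, gamma=0.5)
mask = binarize(corrected, threshold=128)
```

Without the correction, every pixel of this toy image falls below the threshold and the text cannot be separated from the background; after it, only the two text pixels remain in `mask`.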
Abstract: Linear Predictive Coding (LPC) is a widely used technique in voice coders. In this paper we present different implementations of the LPC algorithm used in the majority of voice decoding standards. The windowing/autocorrelation block is implemented in three different versions on a Spartan-3 FPGA. Because the FPGA can integrate a MicroBlaze processor core, the first solution is a pure software implementation of LPC running on this RISC core. The second solution is a pure hardware architecture implemented using a VHDL-based methodology, from description through integration. Finally, the autocorrelation core is implemented as a hardware/software (HW/SW) architecture alongside the existing processor. The performance of each architecture is compared for different data lengths.
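As a software-only illustration of the windowing/autocorrelation block discussed in this abstract, the sketch below applies a Hamming window to a frame and computes its autocorrelation lags. The Levinson-Durbin recursion is added as the usual next step for obtaining LPC coefficients; the abstract does not name it, so treat it, the window choice, the frame length, and the prediction order as assumptions.

```python
import math


def hamming_window(n):
    """Return the n-point Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(frame, order):
    """Return r[0..order], where r[k] = sum over i of frame[i] * frame[i + k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a[1..order], prediction error)."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error == 0:
            break  # degenerate (all-zero) frame, nothing left to predict
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1 - k * k)
    return a[1:], error


# Assumed test frame: 160 samples of a two-tone signal.
frame = [math.sin(0.3 * i) + 0.5 * math.sin(1.1 * i) for i in range(160)]
windowed = [s * w for s, w in zip(frame, hamming_window(len(frame)))]
r = autocorrelation(windowed, order=4)
coeffs, residual_energy = levinson_durbin(r, order=4)
```

The coefficients follow the convention that sample n is predicted as the sum of `coeffs[j - 1] * x[n - j]` for j from 1 to the order. The autocorrelation loop is the part whose cost grows with frame length, which is consistent with it being the block the paper moves into hardware.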
Funding: This work was funded by the University of Jeddah, Jeddah, Saudi Arabia, under Grant No. (UJ-21-ICL-4). The authors, therefore, acknowledge with thanks the University of Jeddah's technical and financial support.
Abstract: Hajj and Umrah are two main religious duties for Muslims. To help the faithful perform their religious duties comfortably in overcrowded areas, a crowd management system is needed to control entry to and exit from each place. Since the number of people is very high, an intelligent crowd management system can be developed to reduce human effort and accelerate the management process. In this work, we propose a crowd management process based on detecting, tracking, and counting human faces using Artificial Intelligence techniques. Human detection and counting are performed to calculate the number of visitors present, and face detection and tracking are used to identify all individuals for security purposes. The proposed crowd management system is composed of three main parts: (1) detecting human faces, (2) assigning each detected face a numerical identifier, and (3) storing the identity of each face in a database for further identification and tracking. The main contribution of this work is the detection and tracking model, which is based on an improved object detection model. An improved YOLOv4 was used for face detection and tracking. It has been very effective in detecting small objects in high-resolution images. The novelty of this method is the integration of an adaptive attention mechanism to improve the performance of the model on the desired task. A channel-wise attention mechanism was applied to the output layers, while both channel-wise and spatial attention were integrated into the building blocks. The main idea of the adaptive attention mechanisms is to make the model focus more on the target and ignore false-positive proposals. We demonstrated the efficiency of the proposed method through extensive experimentation on a publicly available dataset. The WIDER FACE dataset was used for training and evaluating the proposed detection and tracking model. The proposed model achieved good results, with an mAP of 91.2% and a processing speed of 18 FPS on an Nvidia GTX 960 GPU.
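Parts (2) and (3) of the pipeline in this abstract, assigning a numerical identifier to each detected face and storing it for later tracking, can be sketched as follows. The abstract does not say how a face is matched across frames, so the greedy IoU matching, the `FaceRegistry` class, its threshold, and the in-memory dictionary standing in for the database are all hypothetical. The boxes would come from the face detector, which is not modelled here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class FaceRegistry:
    """Hypothetical in-memory stand-in for the face database."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.faces = {}      # face id -> last known bounding box
        self.next_id = 1

    def update(self, boxes):
        """Assign an id to each detected box and return the ids in input order."""
        ids = []
        unmatched = dict(self.faces)  # known faces not yet claimed in this frame
        for box in boxes:
            best_id, best_iou = None, self.iou_threshold
            for face_id, old_box in unmatched.items():
                score = iou(box, old_box)
                if score >= best_iou:
                    best_id, best_iou = face_id, score
            if best_id is None:
                # No sufficiently overlapping known face: register a new visitor.
                best_id = self.next_id
                self.next_id += 1
            else:
                del unmatched[best_id]
            self.faces[best_id] = box
            ids.append(best_id)
        return ids

    def count(self):
        """Number of distinct faces seen so far."""
        return len(self.faces)


registry = FaceRegistry()
first_frame_ids = registry.update([(0, 0, 10, 10), (50, 50, 60, 60)])
second_frame_ids = registry.update([(1, 1, 11, 11), (100, 100, 110, 110)])
```

In the second frame the slightly shifted box keeps its earlier identifier, while the box in a new location receives a fresh one, so `registry.count()` doubles as the visitor count.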
Funding: Makkah Digital Gate Initiative, under grant no. (MDP-IRI-3-2020).
Abstract: Desertification has become a global threat and has caused a crisis, especially in Middle Eastern countries such as Saudi Arabia. Makkah is one of the most important cities in Saudi Arabia that needs to be protected from desertification. The vegetation area in Makkah has been damaged by desertification driven by wind, floods, overgrazing, and global climate change. The damage caused by desertification can be reversed provided urgent action is taken to prevent further degradation of the vegetation area. In this paper, we propose an automatic desertification detection system based on Deep Learning techniques. Aerial images are classified using Convolutional Neural Networks (CNN) to detect land state variation in real time. CNNs have been widely used for computer vision applications such as image classification, image segmentation, and quality enhancement. The proposed CNN model was trained and evaluated on the Aerial Image Dataset (AID). Compared to state-of-the-art methods, the proposed model has better performance while being suitable for embedded implementation. It achieved 96.47% accuracy. In light of the current research, we assert the appropriateness of the proposed CNN model for detecting desertification from aerial images.
Abstract: An autonomous vehicle is a vehicle that can guide itself without human control. It is capable of sensing its environment and moving with little or no human input. This kind of vehicle has become a concrete reality and may pave the way for future systems where computers take over the art of driving. Advanced artificial intelligence control systems interpret sensory information to identify appropriate navigation paths, as well as obstacles and relevant road signs. In this paper, we introduce an intelligent road sign classifier to help autonomous vehicles recognize and understand road signs. The road sign classifier is based on an artificial intelligence technique; in particular, a deep learning model, the Convolutional Neural Network (CNN), is used. CNN is a widely used Deep Learning model for pattern recognition problems such as image classification and object detection. CNN has been successfully used to solve computer vision problems because its way of processing images is similar to human brain decision making. The proposed pipeline was trained and tested using two different datasets. The proposed CNNs achieved high performance in road sign classification, with a validation accuracy of 99.8% and a testing accuracy of 99.6%. The proposed method can be easily implemented for real-time applications.
Abstract: Indoor scene understanding and indoor object detection are complex high-level tasks for automated systems applied to natural environments. Indeed, such tasks require a large number of annotated indoor images to train and test intelligent computer vision applications. One of the challenging questions is how to adopt and enhance technologies to assist indoor navigation for visually impaired people (VIP) and thus improve their daily quality of life. This paper presents a new labeled indoor object dataset built for indoor object detection (useful for indoor localization and navigation tasks). The dataset consists of 8000 indoor images containing 16 different indoor landmark object classes. The originality of the annotations comes from two new aspects taken into account: (1) the spatial relationships between objects present in the scene and (2) the actions that can be applied to those objects (relationships between a VIP and an object). The collected dataset offers several strengths: it covers varied data under varied lighting conditions and complex image backgrounds, which ensures more robustness when training and testing object detectors. The proposed dataset, ready for use, provides 16 vital indoor object classes in order to contribute to indoor navigation assistance for VIP.
Abstract: Computation of stereoscopic depth and disparity map extraction are dynamic research topics. A large variety of algorithms has been developed, among which we cite feature matching, moment extraction, and image representation using descriptors to determine a disparity map. This paper proposes a new method for stereo matching based on Fourier descriptors. The robustness of these descriptors under photometric and geometric transformations provides a better representation of a template or a local region in the image. In our work, we specifically use generalized Fourier descriptors to compute a robust cost function. Then, a box filter is applied for cost aggregation to enforce a smoothness constraint between neighboring pixels. Optimization and disparity calculation are done using dynamic programming, with a cost based on the similarity between generalized Fourier descriptors measured by Euclidean distance. This local cost function is used to optimize correspondences. Our stereo matching algorithm is evaluated using the Middlebury stereo benchmark. The approach has been implemented on parallel high-performance graphics hardware using CUDA to accelerate the algorithm, giving a real-time implementation.
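The three stages described in this abstract (matching cost, box-filter aggregation, dynamic-programming optimization) can be sketched on a single scanline. As a loudly stated simplification, the cost here is the absolute intensity difference, swapped in for the Euclidean distance between generalized Fourier descriptors that the paper uses. The function names, window radius, penalty, and toy data are assumptions, and the code is plain CPU Python rather than the paper's CUDA implementation.

```python
def matching_cost(left, right, max_disp):
    """cost[x][d] = |left[x] - right[x - d]|; out-of-range matches get a large cost."""
    big = 255.0
    return [[abs(left[x] - right[x - d]) if x - d >= 0 else big
             for d in range(max_disp + 1)]
            for x in range(len(left))]


def box_filter(cost, radius):
    """Average each disparity's cost over a 1-D window of neighbouring pixels."""
    n = len(cost)
    out = []
    for x in range(n):
        lo, hi = max(0, x - radius), min(n, x + radius + 1)
        out.append([sum(cost[i][d] for i in range(lo, hi)) / (hi - lo)
                    for d in range(len(cost[0]))])
    return out


def dp_disparity(cost, penalty):
    """Pick one disparity per pixel minimising cost plus penalty * |disparity jump|."""
    n, nd = len(cost), len(cost[0])
    total = [cost[0][:]]   # total[x][d]: best accumulated cost ending at (x, d)
    back = []              # back[x - 1][d]: best previous disparity for (x, d)
    for x in range(1, n):
        row, ptr = [], []
        for d in range(nd):
            prev = min(range(nd), key=lambda p: total[-1][p] + penalty * abs(d - p))
            row.append(cost[x][d] + total[-1][prev] + penalty * abs(d - prev))
            ptr.append(prev)
        total.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final state.
    d = min(range(nd), key=lambda p: total[-1][p])
    path = [d]
    for ptr in reversed(back):
        d = ptr[d]
        path.append(d)
    return path[::-1]


# Toy scanlines (assumed data): from x = 2 on, the left line is the right line
# shifted by 2 pixels, so the true disparity there is 2.
right = [10, 50, 20, 80, 30, 90, 40, 70]
left = [0, 0, 10, 50, 20, 80, 30, 90]
aggregated = box_filter(matching_cost(left, right, max_disp=3), radius=1)
disparity = dp_disparity(aggregated, penalty=5)
```

Each pixel's aggregation and each scanline's optimization are independent of the others, which is the kind of structure that maps naturally onto GPU threads.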