Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R191), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code (22UQU4310373DSR61). This study is supported via funding from Prince Sattam bin Abdulaziz University project number (PSAU/2023/R/1444).
Abstract: Artificial Intelligence (AI) and Computer Vision (CV) advancements have led to many useful methodologies in recent years, particularly to help visually-challenged people. Object detection involves a variety of challenges, for example, handling multiple-class images, images that are altered when captured by a camera, and so on; the test images used here include all these variants as well. Such detection models alert visually-challenged people about their surroundings when they want to walk independently. This study compares four CNN-based pre-trained models predominantly used in image recognition applications: Residual Network (ResNet-50), Inception v3, Dense Convolutional Network (DenseNet-121), and SqueezeNet. Based on the analysis performed on these test images, the study infers that Inception v3 outperformed the other pre-trained models in terms of accuracy and speed. To further improve its performance, the Thermal Exchange Optimization (TEO) algorithm is applied to tune the hyperparameters (number of epochs, batch size, and learning rate), which constitutes the novelty of the work. Better accuracy was achieved owing to the inclusion of an auxiliary classifier as a regularizer, the hyperparameter optimizer, and the factorization approach. Additionally, Inception v3 can handle images of different sizes. This makes Inception v3 the optimum model for assisting visually-challenged people in real-world communication when integrated with Internet of Things (IoT)-based devices.
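To make the tuning step concrete, the following is a minimal sketch of a TEO-style hyperparameter search, paraphrasing the general thermal exchange heuristic (candidate solutions cool toward the best solution's "environment temperature") rather than the authors' exact implementation. The `validate` objective is a stand-in assumption; in practice it would train and validate Inception v3 with the candidate epochs, batch size, and learning rate.

```python
# Hedged sketch of TEO-style hyperparameter tuning (not the paper's code).
import math
import random

BOUNDS = [(5, 50), (8, 128), (1e-4, 1e-1)]   # epochs, batch size, learning rate

def validate(params):
    """Placeholder objective, lower is better; replace with a real
    train-and-validate run of Inception v3 using these hyperparameters."""
    epochs, batch, lr = params
    return (lr - 0.01) ** 2 + abs(batch - 32) / 1000 + abs(epochs - 20) / 100

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def teo_search(pop_size=10, iters=30):
    # random initial population of hyperparameter vectors
    pop = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    best = min(pop, key=validate)
    for t in range(1, iters + 1):
        # cooling factor: each object exchanges "temperature" with the best
        # solution, following Newton's law of cooling
        beta = validate(best) / (sum(validate(p) for p in pop) + 1e-12)
        for i, obj in enumerate(pop):
            new = [clip(b + (x - b) * math.exp(-beta * t), lo, hi)
                   for x, b, (lo, hi) in zip(obj, best, BOUNDS)]
            if validate(new) < validate(obj):   # keep the improved candidate
                pop[i] = new
        best = min(pop + [best], key=validate)
    epochs, batch, lr = best
    return round(epochs), round(batch), lr

print(teo_search())
```

The population-toward-best update is a deliberate simplification of the published TEO algorithm; the point is the shape of the loop, not its exact update rule.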
Funding: The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2022-030.
Abstract: Visual impairment is one of the major problems among people of all age groups across the globe. Visually Impaired Persons (VIPs) require help from others to carry out their day-to-day tasks. Since they experience several problems in their daily lives, technical intervention can help them resolve these challenges. Against this background, an automatic object detection tool is the need of the hour to empower VIPs with safe navigation, and recent advances in the Internet of Things (IoT) and Deep Learning (DL) techniques make it possible. The current study proposes an IoT-assisted Transient Search Optimization with Lightweight RetinaNet-based object detection (TSOLWR-ODVIP) model to help VIPs. The primary aim of the presented TSOLWR-ODVIP technique is to identify different objects surrounding VIPs and to convey the information to them via audio messages. For data acquisition, IoT devices are used in this study. Then, the Lightweight RetinaNet (LWR) model is applied to detect objects accurately. Next, the TSO algorithm is employed for fine-tuning the hyperparameters involved in the LWR model. Finally, the Long Short-Term Memory (LSTM) model is exploited for classifying objects. The performance of the proposed TSOLWR-ODVIP technique was evaluated using a set of objects, and the results were examined under distinct aspects. The comparison study outcomes confirmed that the TSOLWR-ODVIP model could effectually detect and classify the objects, enhancing the quality of life of VIPs.
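As a hedged illustration of the final stage, the sketch below shows an LSTM classification head of the kind the pipeline describes, operating on a sequence of per-frame detection features. The Lightweight RetinaNet feature extractor is abstracted to a dummy tensor, and all dimensions are assumptions rather than the paper's settings.

```python
# Minimal sketch (assumed dimensions, not the paper's code): an LSTM that
# consumes a sequence of per-frame detection feature vectors and emits an
# object-class label.
import torch
import torch.nn as nn

class DetectionLSTMClassifier(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, feats):            # feats: (B, T, feat_dim)
        _, (h_n, _) = self.lstm(feats)   # h_n: (1, B, hidden)
        return self.head(h_n[-1])        # logits: (B, num_classes)

# usage with dummy detector features standing in for LWR output
feats = torch.randn(4, 8, 256)           # 4 clips, 8 frames each
logits = DetectionLSTMClassifier()(feats)
print(logits.shape)                       # torch.Size([4, 10])
```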
Abstract: Vision impairment is a latent problem that affects numerous people across the globe. Technological advancements, particularly the rise of computer processing abilities such as Deep Learning (DL) models and the emergence of wearables, pave a way for assisting visually-impaired persons. The models developed earlier specifically for visually-impaired people work effectually on single-object detection in unconstrained environments. However, in real-time scenarios, these systems are inconsistent in providing effective guidance for visually-impaired people. In addition to object detection, extra information about the location of objects in the scene is essential for visually-impaired people. Keeping this in mind, the current research work presents an Efficient Object Detection Model with Audio Assistive System (EODM-AAS) using a DL-based YOLO v3 model for visually-impaired people. The aim of the research article is to construct a model that can provide a detailed description of the objects around visually-impaired people. The presented model involves a DL-based YOLO v3 model for multi-label object detection. Besides, the presented model determines the position of each object in the scene and finally generates an audio signal to notify visually-impaired people. In order to validate the detection performance of the presented method, a detailed simulation analysis was conducted on four datasets. The simulation results established that the presented model produces effectual outcomes over existing methods.
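A minimal sketch of the position-and-audio step follows: it maps a detection's bounding-box center to a coarse left/center/right phrase and speaks it. The YOLO v3 detector is stubbed with dummy boxes, and pyttsx3 is an assumed offline text-to-speech choice, not necessarily the authors' audio backend.

```python
# Hedged sketch of the audio-assistance step (detector stubbed out).
import pyttsx3

def describe_position(box, frame_width):
    """box = (x_min, y_min, x_max, y_max) in pixels."""
    cx = (box[0] + box[2]) / 2.0          # horizontal center of the box
    if cx < frame_width / 3:
        return "on your left"
    if cx > 2 * frame_width / 3:
        return "on your right"
    return "ahead of you"

def announce(detections, frame_width=640):
    engine = pyttsx3.init()
    for label, box in detections:
        engine.say(f"{label} {describe_position(box, frame_width)}")
    engine.runAndWait()

# dummy detections in place of YOLO v3 output
announce([("chair", (40, 100, 180, 300)), ("door", (500, 50, 630, 460))])
```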
Abstract: Indoor scene understanding and indoor object detection are complex high-level tasks for automated systems applied to natural environments. Indeed, such tasks require huge numbers of annotated indoor images to train and test intelligent computer vision applications. One of the challenging questions is how to adopt and enhance technologies to assist indoor navigation for visually impaired people (VIP) and thus improve their daily life quality. This paper presents a new labeled indoor object dataset elaborated with the goal of indoor object detection (useful for indoor localization and navigation tasks). This dataset consists of 8000 indoor images containing 16 different indoor landmark object classes. The originality of the annotations comes from two new aspects taken into account: (1) the spatial relationships between objects present in the scene and (2) the actions possible to apply to those objects (relationships between a VIP and an object). The collected dataset has many specifications and strengths, as it presents varied data under various lighting conditions and complex image backgrounds to ensure more robustness when training and testing object detectors. The proposed dataset, ready for use, provides 16 vital indoor object classes in order to contribute to indoor assistive navigation for VIP.
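To illustrate what such an annotation might look like, here is a hypothetical record combining a bounding box with the two original facets, spatial relations and applicable actions. The field names and vocabulary are illustrative assumptions, not the dataset's published schema.

```python
# Hypothetical annotation record for one labeled indoor object.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectAnnotation:
    label: str                              # one of the 16 indoor classes
    bbox: Tuple[int, int, int, int]         # x_min, y_min, x_max, y_max
    relations: List[str] = field(default_factory=list)  # object-to-object, e.g. "right_of:chair"
    actions: List[str] = field(default_factory=list)    # VIP-to-object, e.g. "open"

door = ObjectAnnotation(
    label="door",
    bbox=(320, 40, 470, 460),
    relations=["right_of:chair", "next_to:wall"],
    actions=["open", "close", "pass_through"],
)
print(door)
```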