Road traffic monitoring is an important topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was performed on the first image of every burst. Then, to track the vehicles, a template of each vehicle was matched against the succeeding images using the template matching algorithm. To further improve tracking accuracy when multiple template matches occur, Scale-Invariant Feature Transform (SIFT) features were used to find the best possible match among the candidates. An accuracy rate of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherlands dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy rate of 86% for detection and 78% for tracking was achieved.
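A minimal sketch of the burst-tracking step just described, assuming grayscale OpenCV frames and a hypothetical `box = (x, y, w, h)` produced by the blob detector on the first frame; the candidate threshold of 0.8 and the 0.75 Lowe ratio are illustrative choices, not values taken from the paper.

```python
import cv2
import numpy as np

def track_in_burst(frames, box, ratio=0.75):
    """Template-match one vehicle across a burst; SIFT picks among candidates."""
    x, y, w, h = box
    template = frames[0][y:y + h, x:x + w]
    sift = cv2.SIFT_create()
    kp_t, des_t = sift.detectAndCompute(template, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    track = [box]
    for frame in frames[1:]:
        res = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(res >= 0.8 * res.max())   # candidate top-left corners
        best, best_score = None, -1                 # stays None without SIFT support
        for cx, cy in zip(xs, ys):
            patch = frame[cy:cy + h, cx:cx + w]
            kp_p, des_p = sift.detectAndCompute(patch, None)
            if des_t is None or des_p is None:
                continue
            pairs = matcher.knnMatch(des_t, des_p, k=2)
            good = [p for p in pairs
                    if len(p) == 2 and p[0].distance < ratio * p[1].distance]
            if len(good) > best_score:              # keep the best-supported match
                best, best_score = (int(cx), int(cy), w, h), len(good)
        track.append(best)
    return track
```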
Object segmentation and recognition are important areas of computer vision and machine learning, identifying and separating individual objects within an image or video and determining their classes or categories based on their features. The proposed system presents a distinctive approach to object segmentation and recognition using Artificial Neural Networks (ANNs). The system takes RGB images as input and uses a k-means clustering-based segmentation technique to fragment the intended parts of the images into different regions and label them based on their characteristics. Then, two distinct kinds of features are obtained from the segmented images to help identify the objects of interest. An Artificial Neural Network (ANN) is then used to recognize the objects based on their features. Experiments were carried out on three standard datasets, MSRC, MS COCO, and Caltech 101, which are extensively used in object recognition research, to measure the performance of the suggested approach. The findings support the suggested system's validity, as it achieved class recognition accuracies of 89%, 83%, and 90.30% on the MSRC, MS COCO, and Caltech 101 datasets, respectively.
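As an illustration of the segmentation stage, the sketch below clusters the pixels of an RGB image into k regions with k-means; the choice of k = 4 and the use of scikit-learn are assumptions of this example rather than details fixed by the paper. Features computed per labeled region would then feed the ANN recognizer.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(rgb_image, k=4):
    """Cluster RGB pixels into k regions and return a per-pixel label map."""
    h, w, c = rgb_image.shape
    pixels = rgb_image.reshape(-1, c).astype(np.float64)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
    return labels.reshape(h, w)   # each region can then be characterized and labeled
```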
In the modern era of a growing population, it is arduous for humans to monitor every aspect of sports and the events occurring around us. The recognition of different types of sports and events has increasingly incorporated machine learning and artificial intelligence. This research focuses on detecting and recognizing events in sequential photos, characterized by several factors including the size, location, and position of people's body parts and the context around those people. Feature descriptors such as MSER (Maximally Stable Extremal Regions), SIFT (Scale-Invariant Feature Transform), and the degrees of freedom (DOF) between joint points are applied to the skeleton points. Moreover, for the same purpose, features such as BRISK (Binary Robust Invariant Scalable Keypoints), ORB (Oriented FAST and Rotated BRIEF), and HOG (Histogram of Oriented Gradients) are applied to full bodies or silhouettes. The integration of these techniques increases the discriminative nature of the characteristics retrieved in the event identification process, improving the efficiency and reliability of the entire procedure. The extracted features are passed to early fusion and DBSCAN for feature fusion and optimization. A deep belief network is then employed for recognition. Experimental results demonstrate average recognition rates of 87% on the HMDB51 video database and 89% on the YouTube database, outperforming current methods in sports and event identification.
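One plausible reading of the fusion-and-optimization step is sketched below: per-sample descriptors are concatenated (early fusion) and DBSCAN collapses redundant vectors to cluster representatives. The vector layout and DBSCAN parameters are assumptions of this illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def fuse_and_reduce(feature_sets, eps=0.5, min_samples=2):
    """Early fusion by concatenation, then DBSCAN to prune redundant vectors."""
    fused = np.concatenate(feature_sets, axis=1)        # (n_samples, total_dims)
    clustering = DBSCAN(eps=eps, min_samples=min_samples).fit(fused)
    reduced = []
    for label in set(clustering.labels_):
        if label == -1:                                 # keep noise points as-is
            reduced.extend(fused[clustering.labels_ == -1])
        else:                                           # one mean vector per cluster
            reduced.append(fused[clustering.labels_ == label].mean(axis=0))
    return np.vstack(reduced)
```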
Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach integrates multiple methods in a complementary way. The process commences with the application of Gaussian filters to mitigate the impact of noise interference. The images are then segmented using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. Binary Robust Independent Elementary Features (BRIEF) are then extracted from the saliency maps and segmented images. For precise object separation, the Oriented FAST and Rotated BRIEF (ORB) algorithm is employed. Genetic Algorithms (GAs) are used to optimize the Random Forest classifier parameters, leading to improved performance. Our method stands out due to its comprehensive approach, adeptly addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating Genetic Algorithms (GAs) to precisely optimize parameters; this adjustment not only distinguishes our system but also amplifies its overall efficacy. The proposed methodology has demonstrated notable classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% have been attained. Although our method performed well on both datasets, it may face difficulties on real-world data, especially where images have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization is a notable strength, enhancing the overall adaptability and performance of our system.
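The GA-driven parameter optimization can be pictured with the toy loop below, which evolves (n_estimators, max_depth) pairs against cross-validated accuracy; the population size, mutation scheme, and parameter ranges are illustrative assumptions, not the paper's settings.

```python
import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def ga_tune_rf(X, y, generations=10, pop_size=8):
    """Tiny GA over (n_estimators, max_depth) for a Random Forest."""
    def fitness(genes):
        clf = RandomForestClassifier(n_estimators=genes[0], max_depth=genes[1],
                                     random_state=0)
        return cross_val_score(clf, X, y, cv=3).mean()

    pop = [(random.randint(50, 300), random.randint(3, 20)) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]                  # selection: keep best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = random.sample(parents, 2)
            child = (a[0], b[1])                          # crossover: swap genes
            if random.random() < 0.3:                     # mutation on n_estimators
                child = (max(10, child[0] + random.randint(-30, 30)), child[1])
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```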
The recent advancements in vision technology have had a significant impact on our ability to identify multiple objects and understand complex scenes. Various technologies, such as augmented reality-driven scene integration, robotic navigation, autonomous driving, and guided tour systems, rely heavily on this type of scene comprehension. This paper presents a novel segmentation approach based on the UNet network model, aimed at recognizing multiple objects within an image. The methodology begins with the acquisition and preprocessing of the image, followed by segmentation using the fine-tuned UNet architecture. Afterward, we use an annotation tool to accurately label the segmented regions. Upon labeling, significant features are extracted from these segmented objects, encompassing KAZE features, energy-based edge detection, frequency-based features, and blob characteristics. For the classification stage, a convolutional neural network (CNN) is employed. This comprehensive methodology demonstrates a robust framework for achieving accurate and efficient recognition of multiple objects in images. Experimental results on complex object datasets, including MSRC-v2 and PASCAL-VOC12, have been documented: the PASCAL-VOC12 dataset achieved an accuracy rate of 95%, while the MSRC-v2 dataset achieved an accuracy of 89%. The evaluation on these diverse datasets highlights a notably impressive level of performance.
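For the feature-extraction stage, a minimal example of computing KAZE descriptors on one segmented object with OpenCV follows; treating each object as a grayscale crop is an assumption of this sketch.

```python
import cv2
import numpy as np

def kaze_descriptors(gray_crop):
    """Detect KAZE keypoints and compute their descriptors for one object crop."""
    kaze = cv2.KAZE_create()
    keypoints, descriptors = kaze.detectAndCompute(gray_crop, None)
    if descriptors is None:                       # featureless region
        return np.zeros((1, 64), dtype=np.float32)
    return descriptors                            # (n_keypoints, 64) float array
```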
Unmanned aerial vehicles (UAVs) can be used to monitor traffic in a variety of settings, including security, traffic surveillance, and traffic control. Numerous academics have been drawn to this topic because of its challenges and its large variety of applications. This paper proposes a new and efficient vehicle detection and tracking system that is based on extracting the road and identifying objects on it. It is inspired by existing detection systems that comprise stationary data collectors, such as induction loops and stationary cameras, that have a limited field of view and are not mobile. The goal of this study is to develop a method that first extracts the region of interest (ROI) and then finds and tracks the objects of interest. The suggested system is divided into six stages. In the first phase, the photos from the obtained dataset are georeferenced to their actual locations, after which they are all co-registered. In the second phase, the ROI, i.e., the road and its objects, is retrieved using the GrabCut method. The third phase entails data preparation: the segmented images' noise is eliminated using Gaussian blur, after which the images are converted to grayscale and forwarded to the following stage for additional morphological procedures. In the fourth step, the YOLOv3 algorithm is used to find the vehicles in the photos. Following that, the Kalman filter and centroid tracking are used to track the detected cars. The Lucas-Kanade method is then used to perform trajectory analysis on the vehicles. The suggested model is tested and assessed on the Vehicle Aerial Imaging from Drone (VAID) dataset, attaining accuracy levels of 96.7% and 91.6% for detection and tracking, respectively.
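A minimal sketch of the GrabCut-based ROI phase, assuming the road region can be initialized from a rough bounding rectangle; the rectangle initialization and five iterations are illustrative choices.

```python
import cv2
import numpy as np

def grabcut_roi(image_bgr, rect):
    """Extract the road ROI with GrabCut, initialized from a rough rectangle."""
    mask = np.zeros(image_bgr.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)     # internal GMM state buffers
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, rect, bgd_model, fgd_model,
                5, cv2.GC_INIT_WITH_RECT)
    # Pixels marked as sure or probable foreground form the ROI.
    roi_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
    return image_bgr * roi_mask[:, :, np.newaxis].astype(np.uint8)
```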
Road congestion, air pollution, and accident rates have all increased as a result of rising traffic density and worldwide population growth. Over the past ten years, the total number of automobiles has increased significantly around the world. In this paper, a novel method for intelligent traffic surveillance is presented. The proposed model is based on multilabel semantic segmentation using a random forest classifier that classifies the images into five classes. To improve the results, mean-shift clustering was applied to the segmented images. Afterward, the pixels labeled as vehicle were extracted, and blob detection was applied to mark each vehicle. For the validation of each detection, a vehicle verification method based on the structural similarity index is proposed. The tracking of vehicles across the image frames is done using an identifier (ID) assignment technique and a particle filter. Vehicle counting in each frame, along with trajectory estimation, was also done for each object. Our proposed system demonstrated a remarkable vehicle detection rate of 0.83 on Vehicle Aerial Imaging from Drone (VAID), 0.86 on AU-AIR, and 0.75 on the Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) dataset during the experimental evaluation. The proposed system can be used for several purposes, such as vehicle identification in traffic, traffic density estimation at intersections, and traffic congestion sensing on a road.
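The structural-similarity verification step might look like the sketch below, which compares a detected blob's patch against a reference vehicle template; the fixed template size and the 0.5 acceptance threshold are assumptions of this example.

```python
import cv2
from skimage.metrics import structural_similarity

def verify_vehicle(gray_frame, box, template, threshold=0.5):
    """Accept a detection only if its patch is structurally similar to a template."""
    x, y, w, h = box
    patch = gray_frame[y:y + h, x:x + w]
    patch = cv2.resize(patch, (template.shape[1], template.shape[0]))
    score = structural_similarity(patch, template)   # SSIM in [-1, 1]
    return score >= threshold
```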
Hand gesture recognition (HGR) is used in numerous applications, including medical healthcare, industry, and sports. We have developed a real-time hand gesture recognition system using inertial sensors for the smart home application. Developing such a model benefits the medical health field, particularly elderly or disabled people. Home automation has also been proven to be a tremendous benefit for the elderly and disabled. Residents are admitted to smart homes for comfort, luxury, improved quality of life, and protection against intrusion and burglars. This paper proposes a novel system that uses principal component analysis and linear discriminant analysis for feature extraction, and a random forest as a classifier, to improve HGR accuracy. We have achieved an accuracy of 94% on a publicly benchmarked HGR dataset. The proposed system can be used to detect hand gestures in the healthcare industry as well as in the industrial and educational sectors.
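The PCA, LDA, and random forest chain maps naturally onto a scikit-learn pipeline; the component count below is illustrative rather than the paper's setting.

```python
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier

# PCA compacts the inertial feature vector, LDA maximizes class separation,
# and the random forest performs the final gesture classification.
hgr_model = make_pipeline(
    PCA(n_components=20),
    LinearDiscriminantAnalysis(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
# Usage: hgr_model.fit(X_train, y_train); hgr_model.predict(X_test)
```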
Identifying human actions and interactions finds its use in many areas, such as security, surveillance, assisted living, patient monitoring, rehabilitation, sports, and e-learning. This wide range of applications has attracted many researchers to this field. Inspired by existing recognition systems, this paper proposes a new and efficient human-object interaction recognition (HOIR) model based on modeling human pose and scene feature information. There are different aspects involved in an interaction, including the humans, the objects, the various body parts of the human, and the background scene. The main objectives of this research include critically examining the importance of all these elements in determining the interaction, estimating human pose through the image foresting transform (IFT), and detecting the performed interactions based on an optimized multi-feature vector. The proposed methodology has six main phases. The first phase involves preprocessing the images: the videos are converted into image frames, their contrast is adjusted, and noise is removed. In the second phase, the human-object pair is detected and extracted from each image frame. The third phase involves the identification of key body parts of the detected humans using IFT. The fourth phase applies three different kinds of feature extraction techniques. These features are then combined and optimized during the fifth phase. The optimized vector is used to classify the interactions in the last phase. The MSR Daily Activity 3D dataset has been used to test this model and to prove its efficiency. The proposed system obtains an average accuracy of 91.7% on this dataset.
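The preprocessing phase (contrast adjustment followed by noise removal) can be sketched with standard OpenCV calls; histogram equalization and non-local means denoising are assumed stand-ins, since the abstract does not name the exact operators.

```python
import cv2

def preprocess_frame(frame_bgr):
    """Contrast adjustment then denoising for one extracted video frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    equalized = cv2.equalizeHist(gray)            # stretch the intensity histogram
    return cv2.fastNlMeansDenoising(equalized, None, h=10)
```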
In the past two decades, there has been a lot of work on computer vision technology, covering tasks from basic filtering to image classification. The major research areas of this field include object detection and object recognition. Moreover, wireless communication technologies are now widely adopted, and they have changed the way education is delivered. There have been several phases of change in the traditional system. Firstly, the blackboard has been replaced by projectors and other digital screens, so that people can understand concepts better through visualization. Secondly, computer labs in schools are now more common than ever. Thirdly, online classes have become a reality. However, transferring to online education or e-learning is not without challenges. Perceiving three-dimensional (3D) structure from a two-dimensional (2D) image is one such demanding task: humans can perceive it easily, but building 3D models manually in software takes time. Therefore, we propose a method for improving the efficiency of e-learning. Our proposed system consists of two-and-a-half-dimensional (2.5D) feature extraction using machine learning and image processing. These features are then utilized to generate a 3D mesh using an ellipsoidal deformation method. After that, 3D bounding box estimation is applied. Our results show that there is a need to move to 3D virtual reality (VR) with haptic sensors in the field of e-learning for a better understanding of real-world objects; people will thus have more information compared to traditional or simple online education tools. We compare our results with the ShapeNet dataset to check the accuracy of our proposed method. Our proposed system achieved an accuracy of 90.77% on the plane class, 85.72% on the chair class, and 72.14% on the car class. The mean accuracy of our method is 70.89%.
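In the axis-aligned case assumed here, the final 3D bounding-box estimation reduces to per-axis extrema over the generated mesh vertices:

```python
import numpy as np

def axis_aligned_bbox(vertices):
    """Axis-aligned 3D bounding box of a mesh given as an (n, 3) vertex array."""
    lo, hi = vertices.min(axis=0), vertices.max(axis=0)
    center, extent = (lo + hi) / 2.0, hi - lo
    return center, extent        # box midpoint and side lengths along x, y, z
```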
In the last decade, there has been remarkable progress in the areas of object detection and recognition due to high-quality color images along with their depth maps provided by RGB-D cameras. They enable artificially intelligent machines to easily detect and recognize objects and make real-time decisions according to the given scenarios. Depth cues can improve the quality of object detection and recognition. The main purpose of this research study is to find an optimized way of performing object detection and identification; to this end, we propose object detection techniques evaluated on two RGB-D datasets. The proposed methodology extracts surface normals from the depth maps and then performs clustering using the Modified Watson Mixture Model (mWMM). mWMM is challenging to handle when the quality of the image is not good; hence, the proposed RGB-D-based system uses depth cues for segmentation with the help of mWMM. It then extracts multiple features from the segmented images. The selected features are fed to an Artificial Neural Network (ANN) and a Convolutional Neural Network (CNN) for detecting objects. We achieved 92.13% mean accuracy on the NYUv1 dataset and 90.00% mean accuracy on the Redweb_v1 dataset. Finally, the results are compared, and the proposed model with CNN outperforms other state-of-the-art methods. The proposed architecture can be used in autonomous cars, traffic monitoring, and sports scenes.
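Since Watson mixtures model directional data, the pipeline needs per-pixel surface normals from each depth map; a common gradient-based estimator is sketched below (an assumption of this example, not necessarily the paper's exact method).

```python
import numpy as np

def depth_to_normals(depth):
    """Estimate unit surface normals from a depth map via image gradients."""
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth, dtype=np.float64)))
    norm = np.linalg.norm(normals, axis=2, keepdims=True)
    return normals / np.clip(norm, 1e-8, None)    # (h, w, 3), unit length
```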
Pedestrian detection and tracking are vital elements of today's surveillance systems, which make daily life safer for humans. Thus, human detection and visualization have become essential inventions in the field of computer vision. However, developing a surveillance system with multiple-object recognition and tracking, especially in low light and at night-time, is still challenging. Therefore, we propose a novel system based on machine learning and image processing to provide efficient surveillance for pedestrian detection and tracking at night. In particular, the system tackles a two-fold problem by detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. A random forest classifier is adopted for image segmentation to identify pedestrians in an image, and the detection result is passed to a particle filter to solve pedestrian tracking. Through extensive experiments, our system shows 93% segmentation accuracy using the random forest algorithm, which demonstrates high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can confirm that a detected object is a human. Hence, our system provided the best results compared to state-of-the-art systems, which proves the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection/tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.
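One step of a bootstrap particle filter for pedestrian tracking is sketched below; the Gaussian motion model and the externally supplied likelihood function (e.g., appearance similarity at a particle's position) are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, likelihood, motion_std=5.0):
    """Bootstrap particle filter: predict, weight, and resample (x, y) particles.

    Assumes at least one particle receives a nonzero likelihood.
    """
    particles = particles + rng.normal(0.0, motion_std, particles.shape)  # predict
    weights = weights * np.array([likelihood(p) for p in particles])      # update
    weights = weights / weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)      # resample
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```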
Crowd management has become a global concern due to the increased population in urban areas. Better management of pedestrians leads to improved use of public places, and pedestrian behavior is a major factor in crowd management. There are multiple applications available in this area, but the challenge remains open due to the complexity of crowds and its dependence on the environment. In this paper, we propose a new method for pedestrian behavior detection. A Kalman filter has been used to detect pedestrians using a movement-based approach. Next, we perform occlusion detection and removal using a region shrinking method to isolate occluded humans. Human verification is performed on each human silhouette, and wavelet analysis and particle gradient motion features are extracted for each silhouette. The Gray Wolf Optimizer (GWO) has been utilized to optimize the feature set, and behavior classification has then been performed using the Extreme Gradient Boosting (XGBoost) classifier. Performance has been evaluated using pedestrian data from the Avenue and UBI-Fight datasets, which have different environments. The mean achieved accuracies are 91.3% and 85.14% on the Avenue and UBI-Fight datasets, respectively. These results are more accurate compared to other existing methods.
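A compact sketch of Gray Wolf Optimizer-style feature selection: the three best wolves (alpha, beta, delta) pull a population of continuous positions that are thresholded into binary feature masks. The 0.5 binarization threshold and the wrapper fitness function are assumptions of this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def gwo_select(n_features, fitness, wolves=10, iterations=30):
    """Binary GWO: continuous positions in [0, 1], thresholded into feature masks."""
    pos = rng.random((wolves, n_features))
    def mask(p):                                           # position -> binary mask
        return p > 0.5
    for t in range(iterations):
        scores = np.array([fitness(mask(p)) for p in pos])
        leaders = pos[np.argsort(scores)[::-1][:3]]        # alpha, beta, delta
        a = 2.0 * (1 - t / iterations)                     # decreases from 2 to 0
        new_pos = np.zeros_like(pos)
        for leader in leaders:
            A = a * (2 * rng.random(pos.shape) - 1)
            C = 2 * rng.random(pos.shape)
            new_pos += leader - A * np.abs(C * leader - pos)
        pos = np.clip(new_pos / 3.0, 0.0, 1.0)             # average of three pulls
    scores = np.array([fitness(mask(p)) for p in pos])
    return mask(pos[scores.argmax()])
```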
With the dramatic increase in video surveillance applications and public safety measures, the need for an accurate and effective system for abnormal/suspicious activity classification also increases. Although it has multiple applications, the problem is very challenging. In this paper, a novel approach for detecting normal/abnormal activity is proposed. We used the Gaussian Mixture Model (GMM) and a Kalman filter to detect and track the objects, respectively. After that, we performed shadow removal to separate an object from its shadow. After object segmentation, an occlusion detection method was applied to detect occlusion between multiple human silhouettes, and we implemented a novel region shrinking method to isolate occluded humans. Fuzzy c-means clustering is utilized to verify human silhouettes, and motion-based features, including velocity and optical flow, are extracted for each identified silhouette. The Gray Wolf Optimizer (GWO) is used to optimize the feature set, followed by abnormal event classification using the XGBoost classifier. This system is applicable in any surveillance application used for event detection or anomaly detection. The performance of the proposed system is evaluated using the University of Minnesota (UMN) dataset and the UBI (University of Beira Interior)-Fight dataset, each containing a different type of anomaly. The mean accuracy for the UMN and UBI-Fight datasets is 90.14% and 76.9%, respectively. These results are more accurate compared to other existing methods.
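OpenCV's MOG2 subtractor is a GMM-based detector that also labels shadow pixels, which makes the detection-plus-shadow-removal steps compact; using MOG2 as the concrete GMM implementation is an assumption of this sketch.

```python
import cv2

# MOG2 models each pixel with a Gaussian mixture; detectShadows=True marks
# shadow pixels with the value 127 so they can be removed from the mask.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

def detect_foreground(frame):
    """Return a binary foreground mask with shadow pixels suppressed."""
    mask = subtractor.apply(frame)
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop 127 shadows
    return mask
```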
Due to the recently increased requirements of e-learning systems, multiple educational institutes, including kindergartens, have transformed their learning towards virtual education. Automated student health exercise recognition is a difficult but important task due to physical education needs, especially in young learners. The proposed system focuses on the implementation of student health exercise recognition (SHER) using a modified quaternion-based filter for inertial data refinement and data fusion as the preprocessing steps. The cleansed data is then segmented using an overlapping windowing approach, followed by pattern identification in the form of static and kinematic signal patterns. Furthermore, these patterns are utilized to extract cues for both patterned signals, which are further optimized using Fisher's linear discriminant analysis (FLDA). Finally, the physical exercise activities are categorized using extended Kalman filter (EKF)-based neural networks. This system can be implemented in multiple educational establishments, including intelligent training systems, virtual mentors, smart simulations, and interactive learning management methods.
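The overlapping-window segmentation of the cleansed inertial signal can be sketched as below; the window length of 128 samples and 50% overlap are illustrative values rather than the paper's configuration.

```python
import numpy as np

def sliding_windows(signal, window=128, overlap=0.5):
    """Split an (n_samples, n_channels) inertial signal into overlapping windows."""
    step = int(window * (1.0 - overlap))
    return np.stack([signal[i:i + window]
                     for i in range(0, len(signal) - window + 1, step)])
```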
The latest advancements in vision technology offer an evident impact on multi-object recognition and scene understanding. Such scene understanding is a demanding part of several technologies, like augmented reality-based scene integration, robotic navigation, autonomous driving, and tourist guides. By incorporating visual information into contextually unified segments, convolutional neural network-based approaches significantly mitigate the clutter that is common in classical scene-understanding frameworks. In this paper, we propose a convolutional neural network (CNN)-based segmentation method for the recognition of multiple objects in an image. After acquisition and preprocessing, the image is segmented using the CNN. Then, CNN features are extracted from these segmented objects, and discrete cosine transform (DCT) and discrete wavelet transform (DWT) features are computed. After the extraction of CNN features and the computation of classical machine learning features, the two are fused. Then, to select a minimal set of features, genetic algorithm-based feature selection is used. To recognize and understand the multiple objects in the scene, a neuro-fuzzy approach is applied. Once the objects in the scene are recognized, the relationships between them are examined by employing an object-to-object relation approach. Finally, a decision tree assigns the relevant labels to the scenes based on the recognized objects in the image. The experimental results on complex scene datasets, including SUN Red Green Blue-Depth (RGB-D) and Cityscapes, demonstrate remarkable performance.
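The classical feature computation can be illustrated with SciPy and PyWavelets; the Haar wavelet and the 8x8 low-frequency DCT block are assumptions of this example.

```python
import numpy as np
import pywt
from scipy.fftpack import dct

def dct_dwt_features(gray_segment):
    """Concatenate low-frequency 2-D DCT coefficients with Haar DWT subband stats."""
    img = gray_segment.astype(np.float64)
    coeffs = dct(dct(img, axis=0, norm='ortho'), axis=1, norm='ortho')
    dct_feat = coeffs[:8, :8].flatten()                    # low-frequency block
    cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')              # one DWT level
    dwt_feat = np.array([band.mean() for band in (cA, cH, cV, cD)] +
                        [band.std() for band in (cA, cH, cV, cD)])
    return np.concatenate([dct_feat, dwt_feat])
```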
In this research work, an efficient sign language recognition tool for e-learning is proposed with a new type of feature set based on angles and lines. This feature set has the ability to increase the overall performance of machine learning algorithms in an efficient way. Hand gesture recognition based on these features has been implemented for real-time usage. The feature set uses hand landmarks, which are generated with MediaPipe and OpenCV on each frame of the incoming video. The overall algorithm has been tested on two well-known sign language datasets, ASL-alphabet (American Sign Language) and ISL-HS (Irish Sign Language). Different machine learning classifiers, including random forest, decision tree, and naïve Bayes, have been used to classify hand gestures using this unique feature set, and their respective results have been compared. Since the random forest classifier performed best, it has been selected as the base classifier for the proposed system. It showed 96.7% accuracy on ISL-HS and 93.7% accuracy on the ASL-alphabet dataset using the extracted features.
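An angle feature of the kind described, computed between three hand landmarks, is shown below; which landmark triples form the feature set is an assumption of this sketch (the commented usage assumes MediaPipe's normalized landmark coordinates).

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at landmark b formed by landmarks a and c, each (x, y)."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Example with MediaPipe landmark objects (lm[i].x, lm[i].y are normalized):
# angle = joint_angle((lm[5].x, lm[5].y), (lm[6].x, lm[6].y), (lm[8].x, lm[8].y))
```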
E-learning approaches are among the most important platforms for learning through electronic equipment. Related techniques are also useful in other domains, such as crowd and pedestrian analysis, sports, transport, communication, emergency services, management systems, and the education sector. E-learning is still a challenging domain for researchers and developers seeking new trends, advanced tools, and methods, and many are currently working in this domain to fulfill the requirements of industry and the environment. In this paper, we propose a method for pedestrian behavior mining in aerial data, using deep flow features, a graph mining technique, and a convolutional neural network. For input data, the state-of-the-art University of Minnesota (UMN) crowd activity dataset is adopted, which contains aerial indoor and outdoor views of pedestrians; preprocessing is applied to remove extraneous information and reduce computational cost. Deep flow features are extracted to obtain more accurate information. Furthermore, to deal with repetition in the feature data, a graph mining algorithm is applied for feature mining, while a Convolutional Neural Network (CNN) is applied for pedestrian behavior mining. The proposed method shows a mean accuracy of 84.50% and an error rate of 15.50%. The achieved results are therefore more accurate compared to state-of-the-art classification algorithms such as decision trees and artificial neural networks (ANNs).
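The flow-feature extraction can be approximated with OpenCV's dense Farnebäck optical flow as a classical stand-in for the deep flow features used in the paper; the substitution and the parameter values are assumptions of this sketch.

```python
import cv2

def dense_flow(prev_gray, next_gray):
    """Per-pixel motion field between two consecutive grayscale frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return magnitude, angle       # motion strength and direction per pixel
```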
Over the last decade, there has been a surge of attention on establishing ambient assisted living (AAL) solutions to help individuals live independently. From a social and economic perspective, the demographic shift toward an elderly population has brought new challenges to today's society. AAL can offer a variety of solutions for increasing people's quality of life, allowing them to live healthier and more independently for longer. In this paper, we propose a novel AAL solution using a hybrid bidirectional long short-term memory (BiLSTM) and convolutional neural network (CNN) classifier. We first preprocessed the signal data and then used time-frequency features such as signal energy, signal variance, signal frequency, and empirical mode decomposition. The convolutional neural network-bidirectional long short-term memory (CNN-BiLSTM) classifier, together with the Isomap dimensionality reduction algorithm, was then used to select ideal features. We assessed the performance of our proposed system on the publicly accessible human gait database (HuGaDB) benchmark dataset and achieved an accuracy rate of 93.95%. Experiments reveal that the hybrid method gives higher accuracy than a single classifier in the AAL model. The suggested system can assist persons with impairments, as well as carers and medical personnel.
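A minimal Keras rendition of the hybrid CNN-BiLSTM classifier follows; the layer sizes and the (window_length, channels) input shape for gait segments are illustrative assumptions.

```python
import tensorflow as tf

def build_cnn_bilstm(window_length, channels, n_classes):
    """1-D CNN front end followed by a bidirectional LSTM and softmax head."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window_length, channels)),
        tf.keras.layers.Conv1D(64, kernel_size=5, activation='relu'),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(n_classes, activation='softmax'),
    ])

model = build_cnn_bilstm(window_length=128, channels=6, n_classes=12)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```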
Nowadays, activity of daily living (ADL) recognition systems are considered an important field of computer vision. Wearable and optical sensors are widely used to assess the daily living activities of healthy people and people with certain disorders. Although conventional ADL recognition utilizes RGB optical sensors, an RGB-D camera, with its ability to capture depth (distance) information along with visual cues, has greatly enhanced the performance of activity recognition. In this paper, an RGB-D-based ADL recognition system is presented. Initially, human silhouettes are extracted from the noisy background of RGB and depth images to track human movement in a scene. Based on these silhouettes, full-body features and point-based features are extracted, which are further optimized with the population-based incremental learning (PBIL) algorithm. Finally, a random forest classifier is used to classify activities into different categories. An n-fold cross-validation scheme has been used to measure the viability of the proposed model on the RGBD-AC benchmark dataset, where it achieved an accuracy of 92.71%, outperforming other state-of-the-art methodologies.
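PBIL maintains a probability vector over feature inclusion and nudges it toward the best-scoring masks; the population size, learning rate, and wrapper fitness function below are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def pbil_select(n_features, fitness, iterations=50, pop=20, lr=0.1):
    """PBIL: evolve a probability vector over binary feature-inclusion masks."""
    prob = np.full(n_features, 0.5)                        # start unbiased
    best_mask, best_fit = None, -np.inf
    for _ in range(iterations):
        masks = rng.random((pop, n_features)) < prob       # sample population
        scores = np.array([fitness(m) for m in masks])
        elite = masks[scores.argmax()]
        prob = (1 - lr) * prob + lr * elite                # pull toward the elite
        if scores.max() > best_fit:
            best_mask, best_fit = elite, scores.max()
    return best_mask
```

Here `fitness` would typically be a wrapper score, e.g., cross-validated random forest accuracy on the features selected by the mask.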
基金supported by a grant from the Basic Science Research Program through the National Research Foundation(NRF)(2021R1F1A1063634)funded by the Ministry of Science and ICT(MSIT),Republic of KoreaThe authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code(NU/RG/SERC/13/40)+2 种基金Also,the authors are thankful to Prince Satam bin Abdulaziz University for supporting this study via funding from Prince Satam bin Abdulaziz University project number(PSAU/2024/R/1445)This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R54)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Road traffic monitoring is an imperative topic widely discussed among researchers.Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides.However,aerial images provide the flexibility to use mobile platforms to detect the location and motion of the vehicle over a larger area.To this end,different models have shown the ability to recognize and track vehicles.However,these methods are not mature enough to produce accurate results in complex road scenes.Therefore,this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts.The extracted frames were converted to grayscale,followed by the application of a georeferencing algorithm to embed coordinate information into the images.The masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system.Next,Sobel edge detection combined with Canny edge detection and Hough line transform has been applied for noise reduction.After preprocessing,the blob detection algorithm helped detect the vehicles.Vehicles of varying sizes have been detected by implementing a dynamic thresholding scheme.Detection was done on the first image of every burst.Then,to track vehicles,the model of each vehicle was made to find its matches in the succeeding images using the template matching algorithm.To further improve the tracking accuracy by incorporating motion information,Scale Invariant Feature Transform(SIFT)features have been used to find the best possible match among multiple matches.An accuracy rate of 87%for detection and 80%accuracy for tracking in the A1 Motorway Netherland dataset has been achieved.For the Vehicle Aerial Imaging from Drone(VAID)dataset,an accuracy rate of 86%for detection and 78%accuracy for tracking has been achieved.
基金supported by the MSIT(Ministry of Science and ICT)Korea,under the ITRC(Information Technology Research Center)Support Program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation)+1 种基金Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R410),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabiathe Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code(NU/RG/SERC/12/6).
文摘Object segmentation and recognition is an imperative area of computer vision andmachine learning that identifies and separates individual objects within an image or video and determines classes or categories based on their features.The proposed system presents a distinctive approach to object segmentation and recognition using Artificial Neural Networks(ANNs).The system takes RGB images as input and uses a k-means clustering-based segmentation technique to fragment the intended parts of the images into different regions and label thembased on their characteristics.Then,two distinct kinds of features are obtained from the segmented images to help identify the objects of interest.An Artificial Neural Network(ANN)is then used to recognize the objects based on their features.Experiments were carried out with three standard datasets,MSRC,MS COCO,and Caltech 101 which are extensively used in object recognition research,to measure the productivity of the suggested approach.The findings from the experiment support the suggested system’s validity,as it achieved class recognition accuracies of 89%,83%,and 90.30% on the MSRC,MS COCO,and Caltech 101 datasets,respectively.
基金the MSIT(Ministry of Science and ICT),Korea,under the ICAN(ICT Challenge and Advanced Network of HRD)Program(IITP-2024-RS-2022-00156326)the IITP(Institute of Information&Communications Technology Planning&Evaluation).Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2024R440)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.This research was supported by the Deanship of Scientific Research at Najran University,under the Research Group Funding program grant code(NU/RG/SERC/13/30).
文摘In the modern era of a growing population,it is arduous for humans to monitor every aspect of sports,events occurring around us,and scenarios or conditions.This recognition of different types of sports and events has increasingly incorporated the use of machine learning and artificial intelligence.This research focuses on detecting and recognizing events in sequential photos characterized by several factors,including the size,location,and position of people’s body parts in those pictures,and the influence around those people.Common approaches utilized,here are feature descriptors such as MSER(Maximally Stable Extremal Regions),SIFT(Scale-Invariant Feature Transform),and DOF(degree of freedom)between the joint points are applied to the skeleton points.Moreover,for the same purposes,other features such as BRISK(Binary Robust Invariant Scalable Keypoints),ORB(Oriented FAST and Rotated BRIEF),and HOG(Histogram of Oriented Gradients)are applied on full body or silhouettes.The integration of these techniques increases the discriminative nature of characteristics retrieved in the identification process of the event,hence improving the efficiency and reliability of the entire procedure.These extracted features are passed to the early fusion and DBscan for feature fusion and optimization.Then deep belief,network is employed for recognition.Experimental results demonstrate a separate experiment’s detection average recognition rate of 87%in the HMDB51 video database and 89%in the YouTube database,showing a better perspective than the current methods in sports and event identification.
基金a grant from the Basic Science Research Program through the National Research Foundation(NRF)(2021R1F1A1063634)funded by the Ministry of Science and ICT(MSIT)Republic of Korea.This research is supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2024R410)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding program Grant Code(NU/RG/SERC/12/6).
文摘Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occlusion,and limited labeled data.To address these challenges,we introduce a comprehensive methodology toenhance image classification and object detection accuracy.The proposed approach involves the integration ofmultiple methods in a complementary way.The process commences with the application of Gaussian filters tomitigate the impact of noise interference.These images are then processed for segmentation using Fuzzy C-Meanssegmentation in parallel with saliency mapping techniques to find the most prominent regions.The Binary RobustIndependent Elementary Features(BRIEF)characteristics are then extracted fromdata derived fromsaliency mapsand segmented images.For precise object separation,Oriented FAST and Rotated BRIEF(ORB)algorithms areemployed.Genetic Algorithms(GAs)are used to optimize Random Forest classifier parameters which lead toimproved performance.Our method stands out due to its comprehensive approach,adeptly addressing challengessuch as changing backdrops,occlusion,and limited labeled data concurrently.A significant enhancement hasbeen achieved by integrating Genetic Algorithms(GAs)to precisely optimize parameters.This minor adjustmentnot only boosts the uniqueness of our system but also amplifies its overall efficacy.The proposed methodologyhas demonstrated notable classification accuracies of 90.9%and 89.0%on the challenging Corel-1k and MSRCdatasets,respectively.Furthermore,detection accuracies of 87.2%and 86.6%have been attained.Although ourmethod performed well in both datasets it may face difficulties in real-world data especially where datasets havehighly complex backgrounds.Despite these limitations,GAintegration for parameter optimization shows a notablestrength in enhancing the overall adaptability and performance of our system.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ICAN(ICT Challenge and Advanced Network of HRD)Program(IITP-2024-RS-2022-00156326)supervised by the IITP(Institute of Information&Communications Technology Planning&Evaluation)+2 种基金The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code(NU/GP/SERC/13/30)funding for this work was provided by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2024R410)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University,Arar,KSA for funding this research work through the Project Number“NBU-FFR-2024-231-06”.
文摘The recent advancements in vision technology have had a significant impact on our ability to identify multiple objects and understand complex scenes.Various technologies,such as augmented reality-driven scene integration,robotic navigation,autonomous driving,and guided tour systems,heavily rely on this type of scene comprehension.This paper presents a novel segmentation approach based on the UNet network model,aimed at recognizing multiple objects within an image.The methodology begins with the acquisition and preprocessing of the image,followed by segmentation using the fine-tuned UNet architecture.Afterward,we use an annotation tool to accurately label the segmented regions.Upon labeling,significant features are extracted from these segmented objects,encompassing KAZE(Accelerated Segmentation and Extraction)features,energy-based edge detection,frequency-based,and blob characteristics.For the classification stage,a convolution neural network(CNN)is employed.This comprehensive methodology demonstrates a robust framework for achieving accurate and efficient recognition of multiple objects in images.The experimental results,which include complex object datasets like MSRC-v2 and PASCAL-VOC12,have been documented.After analyzing the experimental results,it was found that the PASCAL-VOC12 dataset achieved an accuracy rate of 95%,while the MSRC-v2 dataset achieved an accuracy of 89%.The evaluation performed on these diverse datasets highlights a notably impressive level of performance.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ICAN(ICT Challenge and Advanced Network of HRD)program(IITP-2023-RS-2022-00156326)supervised by the IITP(Institute of Information&Communications Technology Planning&Evaluation).
文摘Unmanned aerial vehicles(UAVs)can be used to monitor traffic in a variety of settings,including security,traffic surveillance,and traffic control.Numerous academics have been drawn to this topic because of the challenges and the large variety of applications.This paper proposes a new and efficient vehicle detection and tracking system that is based on road extraction and identifying objects on it.It is inspired by existing detection systems that comprise stationary data collectors such as induction loops and stationary cameras that have a limited field of view and are not mobile.The goal of this study is to develop a method that first extracts the region of interest(ROI),then finds and tracks the items of interest.The suggested system is divided into six stages.The photos from the obtained dataset are appropriately georeferenced to their actual locations in the first phase,after which they are all co-registered.The ROI,or road and its objects,are retrieved using the GrabCut method in the second phase.The third phase entails data preparation.The segmented images’noise is eliminated using Gaussian blur,after which the images are changed to grayscale and forwarded to the following stage for additional morphological procedures.The YOLOv3 algorithm is used in the fourth step to find any automobiles in the photos.Following that,the Kalman filter and centroid tracking are used to perform the tracking of the detected cars.The Lucas-Kanade method is then used to perform the trajectory analysis on the vehicles.The suggested model is put to the test and assessed using the Vehicle Aerial Imaging from Drone(VAID)dataset.For detection and tracking,the model was able to attain accuracy levels of 96.7%and 91.6%,respectively.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)Support Program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation).The funding of this work was provided by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R410),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Road congestion,air pollution,and accident rates have all increased as a result of rising traffic density andworldwide population growth.Over the past ten years,the total number of automobiles has increased significantly over the world.In this paper,a novel method for intelligent traffic surveillance is presented.The proposed model is based on multilabel semantic segmentation using a random forest classifier which classifies the images into five classes.To improve the results,mean-shift clustering was applied to the segmented images.Afterward,the pixels given the label for the vehicle were extracted and blob detection was applied to mark each vehicle.For the validation of each detection,a vehicle verification method based on the structural similarity index is proposed.The tracking of vehicles across the image frames is done using the Identifier(ID)assignment technique and particle filter.Also,vehicle counting in each frame along with trajectory estimation was done for each object.Our proposed system demonstrated a remarkable vehicle detection rate of 0.83 over Vehicle Aerial Imaging from Drone(VAID),0.86 over AU-AIR,and 0.75 over the Unmanned Aerial Vehicle Benchmark Object Detection and Tracking(UAVDT)dataset during the experimental evaluation.The proposed system can be used for several purposes,such as vehicle identification in traffic,traffic density estimation at intersections,and traffic congestion sensing on a road.
基金supported by a grant (2021R1F1A1063634)of the Basic Science Research Program through the National Research Foundation (NRF)funded by the Ministry of Education,Republic of Korea.
文摘Hand gesture recognition (HGR) is used in a numerous applications,including medical health-care, industrial purpose and sports detection.We have developed a real-time hand gesture recognition system using inertialsensors for the smart home application. Developing such a model facilitatesthe medical health field (elders or disabled ones). Home automation has alsobeen proven to be a tremendous benefit for the elderly and disabled. Residentsare admitted to smart homes for comfort, luxury, improved quality of life,and protection against intrusion and burglars. This paper proposes a novelsystem that uses principal component analysis, linear discrimination analysisfeature extraction, and random forest as a classifier to improveHGRaccuracy.We have achieved an accuracy of 94% over the publicly benchmarked HGRdataset. The proposed system can be used to detect hand gestures in thehealthcare industry as well as in the industrial and educational sectors.
基金This research was supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)support program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation)This work has also been supported by PrincessNourah bint Abdulrahman UniversityResearchers Supporting Project Number(PNURSP2022R239),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.Alsothis work was partially supported by the Taif University Researchers Supporting Project Number(TURSP-2020/115),Taif University,Taif,Saudi Arabia.
文摘Identifying human actions and interactions finds its use in manyareas, such as security, surveillance, assisted living, patient monitoring, rehabilitation,sports, and e-learning. This wide range of applications has attractedmany researchers to this field. Inspired by the existing recognition systems,this paper proposes a new and efficient human-object interaction recognition(HOIR) model which is based on modeling human pose and scene featureinformation. There are different aspects involved in an interaction, includingthe humans, the objects, the various body parts of the human, and the backgroundscene. Themain objectives of this research include critically examiningthe importance of all these elements in determining the interaction, estimatinghuman pose through image foresting transform (IFT), and detecting the performedinteractions based on an optimizedmulti-feature vector. The proposedmethodology has six main phases. The first phase involves preprocessing theimages. During preprocessing stages, the videos are converted into imageframes. Then their contrast is adjusted, and noise is removed. In the secondphase, the human-object pair is detected and extracted from each image frame.The third phase involves the identification of key body parts of the detectedhumans using IFT. The fourth phase relates to three different kinds of featureextraction techniques. Then these features are combined and optimized duringthe fifth phase. The optimized vector is used to classify the interactions in thelast phase. TheMSRDaily Activity 3D dataset has been used to test this modeland to prove its efficiency. The proposed system obtains an average accuracyof 91.7% on this dataset.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)support program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation).In additionsupport of the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University,This work has also been supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2022R239),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.Alsosupported by the Taif University Researchers Supporting Project Number(TURSP-2020/115),Taif University,Taif,Saudi Arabia.
文摘In the past two decades,there has been a lot of work on computer vision technology that incorporates many tasks which implement basic filtering to image classification.Themajor research areas of this field include object detection and object recognition.Moreover,wireless communication technologies are presently adopted and they have impacted the way of education that has been changed.There are different phases of changes in the traditional system.Perception of three-dimensional(3D)from two-dimensional(2D)image is one of the demanding tasks.Because human can easily perceive but making 3D using software will take time manually.Firstly,the blackboard has been replaced by projectors and other digital screens so such that people can understand the concept better through visualization.Secondly,the computer labs in schools are now more common than ever.Thirdly,online classes have become a reality.However,transferring to online education or e-learning is not without challenges.Therefore,we propose a method for improving the efficiency of e-learning.Our proposed system consists of twoand-a-half dimensional(2.5D)features extraction using machine learning and image processing.Then,these features are utilized to generate 3D mesh using ellipsoidal deformation method.After that,3D bounding box estimation is applied.Our results show that there is a need to move to 3D virtual reality(VR)with haptic sensors in the field of e-learning for a better understanding of real-world objects.Thus,people will have more information as compared to the traditional or simple online education tools.We compare our result with the ShapeNet dataset to check the accuracy of our proposed method.Our proposed system achieved an accuracy of 90.77%on plane class,85.72%on chair class,and car class have 72.14%.Mean accuracy of our method is 70.89%.
文摘In the last decade,there has been remarkable progress in the areas of object detection and recognition due to high-quality color images along with their depth maps provided by RGB-D cameras.They enable artificially intelligent machines to easily detect and recognize objects and make real-time decisions according to the given scenarios.Depth cues can improve the quality of object detection and recognition.The main purpose of this research study to find an optimized way of object detection and identification we propose techniques of object detection using two RGB-D datasets.The proposed methodology extracts image normally from depth maps and then performs clustering using the Modified Watson Mixture Model(mWMM).mWMM is challenging to handle when the quality of the image is not good.Hence,the proposed RGB-D-based system uses depth cues for segmentation with the help of mWMM.Then it extracts multiple features from the segmented images.The selected features are fed to the Artificial Neural Network(ANN)and Convolutional Neural Network(CNN)for detecting objects.We achieved 92.13%of mean accuracy over NYUv1 dataset and 90.00%of mean accuracy for the Redweb_v1 dataset.Finally,their results are compared and the proposed model with CNN outperforms other state-of-the-art methods.The proposed architecture can be used in autonomous cars,traffic monitoring,and sports scenes.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)support program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation)+2 种基金Also,this work was partially supported by the Taif University Researchers Supporting Project Number(TURSP-2020/115)Taif University,Taif,Saudi Arabia.This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2023R239)PrincessNourah bint Abdulrahman University,Riyadh,Saudi Arabia.
Abstract: Pedestrian detection and tracking are vital elements of today's surveillance systems, which make daily life safer for humans. Thus, human detection and visualization have become essential inventions in the field of computer vision. However, developing a surveillance system with multiple object recognition and tracking, especially in low light and at night-time, is still challenging. Therefore, we propose a novel system based on machine learning and image processing to provide efficient pedestrian detection and tracking at night. In particular, the system tackles a two-fold problem: detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. A random forest classifier is adopted for image segmentation to identify pedestrians in an image, and the detection results are passed to a particle filter for pedestrian tracking. Through extensive experiments, our system shows 93% segmentation accuracy using the random forest algorithm, with particularly high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can confirm that a detected object is a human. Hence, our system provided the best results compared to state-of-the-art systems, which proves the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection/tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.
Funding: Partially supported by the Taif University Researchers Supporting Project number (TURSP-2020/115), Taif University, Taif, Saudi Arabia, and supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
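The particle-filter tracking stage described above can be illustrated with one bootstrap predict-update-resample cycle; the motion and measurement noise levels below are assumed values, not taken from the paper:

```python
import numpy as np

def particle_filter_step(particles, weights, measurement,
                         motion_std=3.0, meas_std=5.0, rng=None):
    """One predict-update-resample cycle of a bootstrap particle filter.

    particles: (N, 2) candidate pedestrian positions (x, y).
    measurement: (2,) detected position from the segmentation stage.
    """
    rng = rng or np.random.default_rng()
    # Predict: diffuse particles with Gaussian motion noise.
    particles = particles + rng.normal(0, motion_std, particles.shape)
    # Update: reweight by the likelihood of the detection under each particle.
    d2 = np.sum((particles - measurement) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / meas_std ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

particles = np.random.default_rng(0).uniform(0, 100, (500, 2))
weights = np.full(500, 1 / 500)
particles, weights = particle_filter_step(particles, weights, np.array([40.0, 60.0]))
print("estimated position:", particles.mean(axis=0))
```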
Abstract: Crowd management has become a global concern due to the increased population in urban areas. Better management of pedestrians leads to improved use of public places, and pedestrian behavior is a major factor in crowd management. Multiple applications exist in this area, but the problem remains open because of the complexity of crowds and its dependence on the environment. In this paper, we propose a new method for pedestrian behavior detection. A Kalman filter is used to detect pedestrians through a movement-based approach. Next, we perform occlusion detection and removal using a region-shrinking method to isolate occluded humans. Human verification is performed on each silhouette, and wavelet-analysis and particle-gradient-motion features are extracted for each one. The Gray Wolf Optimizer (GWO) is utilized to optimize the feature set, and behavior classification is then performed using the Extreme Gradient Boost (XGBoost) classifier. Performance is evaluated on pedestrian data from the Avenue and UBI-Fight datasets, which feature different environments. The mean achieved accuracies are 91.3% and 85.14% over the Avenue and UBI-Fight datasets, respectively. These results are more accurate than those of other existing methods.
Funding: The authors acknowledge the Deanship of Scientific Research at King Faisal University for financial support under the Nasher Track (Grant No. NA000239). Additionally, this research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
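The movement-based detection step built on a Kalman filter might look like the following constant-velocity sketch, where the state holds position and velocity and only positions are observed; the noise covariances are assumed:

```python
import numpy as np

# Constant-velocity Kalman filter; state = [x, y, vx, vy].
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # we observe position only
Q = np.eye(4) * 0.01                               # process noise
R = np.eye(2) * 4.0                                # measurement noise

def kalman_step(x, P, z):
    """Predict the next state, then correct it with measurement z."""
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x = x + K @ (z - H @ x)                        # correct
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4)
for z in [np.array([10.0, 5.0]), np.array([12.0, 6.0]), np.array([14.2, 7.1])]:
    x, P = kalman_step(x, P, z)
print("tracked position/velocity:", x)
```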
Abstract: With the dramatic increase in video surveillance applications and public safety measures, the need for an accurate and effective system for abnormal/suspicious activity classification also increases. Although it has multiple applications, the problem is very challenging. In this paper, a novel approach for detecting normal/abnormal activity is proposed. We used the Gaussian Mixture Model (GMM) and a Kalman filter to detect and track objects, respectively. After that, we performed shadow removal to segment an object from its shadow. After object segmentation, we performed an occlusion detection method to detect occlusion between multiple human silhouettes, and we implemented a novel region-shrinking method to isolate occluded humans. Fuzzy c-means is utilized to verify human silhouettes, and motion-based features, including velocity and optical flow, are extracted for each identified silhouette. The Gray Wolf Optimizer (GWO) is used to optimize the feature set, followed by abnormal event classification performed with the XGBoost classifier. This system is applicable to any surveillance application used for event detection or anomaly detection. Performance of the proposed system is evaluated using the University of Minnesota (UMN) dataset and the UBI (University of Beira Interior)-Fight dataset, each containing a different type of anomaly. The mean accuracy for the UMN and UBI-Fight datasets is 90.14% and 76.9%, respectively. These results are more accurate than those of other existing methods.
Funding: Supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
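For the GMM-based detection stage, a plausible minimal version uses OpenCV's MOG2 background subtractor, which is itself GMM-based and also flags shadows, echoing the paper's shadow-removal step; the video filename is hypothetical:

```python
import cv2

# MOG2 is a GMM-based background subtractor; used here as a stand-in
# for the paper's GMM detection stage.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=True)

cap = cv2.VideoCapture("surveillance.mp4")  # hypothetical input video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # MOG2 marks shadow pixels as 127; keep only confident foreground (255).
    _, fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    blobs = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 300]
cap.release()
```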
Abstract: Due to the recently increased requirements of e-learning systems, multiple educational institutes, including kindergartens, have shifted toward virtual education. Automated recognition of student health exercises is a difficult but important task, given the physical education needs of young learners in particular. The proposed system focuses on student health exercise recognition (SHER) using a modified quaternion-based filter for inertial data refinement and data fusion as the pre-processing steps. The cleansed data is then segmented using an overlapping windowing approach, followed by the identification of static and kinematic signal patterns. These patterns are used to extract cues for both types of signal, which are further optimized using Fisher's linear discriminant analysis (FLDA). Finally, the physical exercise activities are categorized using extended Kalman filter (EKF)-based neural networks. This system can be implemented in multiple educational establishments, including intelligent training systems, virtual mentors, smart simulations, and interactive learning management methods.
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
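The overlapping-window segmentation and FLDA optimization steps could be sketched as follows; the window size, overlap, statistical cues, and placeholder signal and labels are all assumptions, with scikit-learn's LinearDiscriminantAnalysis standing in for FLDA:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def overlapping_windows(signal, size=128, overlap=0.5):
    """Segment an inertial signal into overlapping windows."""
    step = int(size * (1 - overlap))
    return np.array([signal[i:i + size]
                     for i in range(0, len(signal) - size + 1, step)])

def window_features(w):
    """Simple statistical cues per window (stand-in for the paper's cues)."""
    return [w.mean(), w.std(), w.min(), w.max(), np.abs(np.diff(w)).mean()]

rng = np.random.default_rng(0)
signal = rng.normal(size=4096)                    # placeholder accelerometer axis
X = np.array([window_features(w) for w in overlapping_windows(signal)])
y = rng.integers(0, 3, len(X))                    # placeholder exercise labels
X_opt = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
```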
Abstract: The latest advancements in vision technology have an evident impact on multi-object recognition and scene understanding. Such scene understanding is a demanding part of several technologies, such as augmented reality-based scene integration, robotic navigation, autonomous driving, and tourist guides. By incorporating visual information into contextually unified segments, convolutional neural network-based approaches significantly mitigate the clutter that is common in classical scene-understanding frameworks. In this paper, we propose a convolutional neural network (CNN)-based segmentation method for the recognition of multiple objects in an image. Initially, after acquisition and preprocessing, the image is segmented using a CNN. Then, CNN features are extracted from the segmented objects, and discrete cosine transform (DCT) and discrete wavelet transform (DWT) features are computed. After the extraction of CNN features and the computation of classical machine learning features, the two are fused. Then, to select a minimal set of features, genetic algorithm-based feature selection is used. To recognize and understand the multiple objects in the scene, a neuro-fuzzy approach is applied. Once the objects in the scene are recognized, the relationships between them are examined using an object-to-object relation approach. Finally, a decision tree assigns the relevant labels to the scenes based on the recognized objects in the image. Experimental results on complex scene datasets, including SUN Red Green Blue-Depth (RGB-D) and Cityscapes, demonstrate remarkable performance.
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
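The DCT/DWT feature computation and early fusion by concatenation might look like this sketch, assuming a 64x64 segmented patch and a one-level Haar wavelet; the number of retained DCT coefficients is an arbitrary choice:

```python
import numpy as np
import pywt                      # PyWavelets
from scipy.fft import dctn       # 2-D discrete cosine transform

def dct_dwt_features(patch, n_dct=32):
    """Concatenate low-order DCT coefficients with DWT sub-band statistics."""
    coeffs = dctn(patch, norm="ortho")
    dct_feat = coeffs.flatten()[:n_dct]           # keep low-order coefficients
    cA, (cH, cV, cD) = pywt.dwt2(patch, "haar")   # one-level 2-D Haar DWT
    dwt_feat = [b.mean() for b in (cA, cH, cV, cD)] + \
               [b.std() for b in (cA, cH, cV, cD)]
    return np.concatenate([dct_feat, dwt_feat])   # early fusion by concatenation

patch = np.random.default_rng(0).random((64, 64))  # placeholder segmented object
features = dct_dwt_features(patch)
print(features.shape)                              # (40,)
```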
Abstract: In this research work, an efficient sign language recognition tool for e-learning is proposed, with a new type of feature set based on angles and lines. This feature set can increase the overall performance of machine learning algorithms in an efficient way. Hand gesture recognition based on these features has been implemented for real-time usage. The feature set uses hand landmarks, which are generated with MediaPipe and OpenCV on each frame of the incoming video. The overall algorithm has been tested on two well-known sign language datasets, ASL-alphabet (American Sign Language) and ISL-HS (Irish Sign Language). Different machine learning classifiers, including random forest, decision tree, and naïve Bayes, were used to classify hand gestures with this unique feature set, and their respective results were compared. Since the random forest classifier performed best, it was selected as the base classifier for the proposed system. It showed 96.7% accuracy on ISL-HS and 93.7% accuracy on the ASL-alphabet dataset using the extracted features.
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
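Since the paper builds its angle-and-line features on MediaPipe hand landmarks, a minimal sketch of the angle part is given below; the exact landmark triplets and feature definitions used by the authors are not specified, so joint angles over consecutive landmark triplets are an assumption:

```python
import numpy as np
import cv2
import mediapipe as mp

def hand_angles(image_bgr):
    """Extract joint-angle features from MediaPipe hand landmarks."""
    hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)
    res = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    hands.close()
    if not res.multi_hand_landmarks:
        return None                                # no hand detected
    pts = np.array([[lm.x, lm.y]
                    for lm in res.multi_hand_landmarks[0].landmark])
    feats = []
    # Angle at the middle point of each consecutive landmark triplet.
    for a, b, c in zip(pts[:-2], pts[1:-1], pts[2:]):
        v1, v2 = a - b, c - b
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        feats.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.array(feats)

# Features from many labeled frames would then train a classifier, e.g.
# sklearn's RandomForestClassifier(n_estimators=100).fit(X, y).
```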
Abstract: E-learning approaches are among the most important platforms for learning through electronic equipment. Such techniques are also useful in other domains involving groups of people, such as crowd and pedestrian analysis, sports, transport, communication, emergency services, management systems, and the education sector. E-learning remains a challenging domain for researchers and developers seeking new trends and advanced tools and methods, and many are currently working to fulfill the requirements of industry and the environment. In this paper, we propose a method for pedestrian behavior mining from aerial data using deep flow features, a graph mining technique, and a convolutional neural network. As input data, the state-of-the-art crowd activity University of Minnesota (UMN) dataset is adopted, which contains aerial indoor and outdoor views of pedestrians; pre-processing is applied to remove extraneous information and reduce computational cost. Deep flow features are extracted to obtain more accurate information. Furthermore, to deal with redundancy in the feature data, a graph mining algorithm is applied, while a Convolutional Neural Network (CNN) performs the pedestrian behavior mining. The proposed method achieves a mean accuracy of 84.50% with an error rate of 15.50%. The achieved results are therefore more accurate than those of state-of-the-art classification algorithms such as decision trees and artificial neural networks (ANNs).
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
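As a stand-in for the deep flow features (the exact deep-flow variant is not specified in the abstract), the following sketch pools dense Farneback optical-flow magnitudes over a coarse grid to form a per-frame motion descriptor:

```python
import cv2
import numpy as np

def flow_features(prev_gray, gray, grid=8):
    """Dense optical-flow descriptor pooled on a coarse grid.

    Farneback flow is used here as a widely available stand-in
    for the paper's deep flow features.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    h, w = mag.shape
    cells = []
    for i in range(0, h - h % grid, h // grid):
        for j in range(0, w - w % grid, w // grid):
            block = mag[i:i + h // grid, j:j + w // grid]
            cells.append(block.mean())    # mean motion magnitude per cell
    return np.array(cells)                # grid*grid descriptor per frame pair
```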
Abstract: Over the last decade, there has been a surge of attention toward establishing ambient assisted living (AAL) solutions that help individuals live independently. From a social and economic perspective, the demographic shift toward an elderly population has brought new challenges to today's society. AAL can offer a variety of solutions for increasing people's quality of life, allowing them to live healthier and more independently for longer. In this paper, we propose a novel AAL solution using a hybrid classifier that combines bidirectional long short-term memory networks (BiLSTM) with a convolutional neural network (CNN). We first pre-process the signal data and then compute time-frequency features such as signal energy, signal variance, signal frequency, and empirical mode decomposition. The CNN-BiLSTM classifier, combined with the Isomap dimensionality reduction algorithm, is then used to select the ideal features. We assessed the performance of the proposed system on the publicly accessible human gait database (HuGaDB) benchmark dataset and achieved an accuracy of 93.95%. The experiments reveal that the hybrid method gives higher accuracy than a single classifier in the AAL model. The suggested system can assist persons with impairments, as well as carers and medical personnel.
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
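A minimal CNN-BiLSTM of the kind described, with convolutional layers for local motion patterns and a bidirectional LSTM for temporal context, could be sketched in Keras as follows; the window length, channel count, and class count are assumed, not taken from HuGaDB's actual layout:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn_bilstm(window_len=128, n_channels=6, n_classes=12):
    """Conv1D stack for local patterns, BiLSTM for bidirectional context."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(window_len, n_channels)),
        layers.Conv1D(64, 5, activation="relu", padding="same"),
        layers.MaxPooling1D(2),
        layers.Conv1D(128, 5, activation="relu", padding="same"),
        layers.MaxPooling1D(2),
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn_bilstm()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```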
Abstract: Nowadays, activities of daily living (ADL) recognition systems are considered an important field of computer vision. Wearable and optical sensors are widely used to assess daily living activities in healthy people and in people with certain disorders. Although conventional ADL recognition utilizes RGB optical sensors, an RGB-D camera, with its ability to capture depth (distance) information alongside visual cues, has greatly enhanced the performance of activity recognition. In this paper, an RGB-D-based ADL recognition system is presented. Initially, human silhouettes are extracted from the noisy backgrounds of RGB and depth images to track human movement in a scene. Based on these silhouettes, full-body features and point-based features are extracted and further optimized with the probability-based incremental learning (PBIL) algorithm. Finally, a random forest classifier is used to classify activities into different categories. An n-fold cross-validation scheme is used to measure the viability of the proposed model on the RGBD-AC benchmark dataset, achieving an accuracy of 92.71%, surpassing other state-of-the-art methodologies.
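The n-fold cross-validation protocol with a random forest classifier reduces to a few lines with scikit-learn; the feature matrix and labels below are random placeholders for the optimized silhouette features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix and activity labels; in the paper, X would hold
# the PBIL-optimized full-body and point-based silhouette features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))
y = rng.integers(0, 5, 300)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=10)   # n-fold cross-validation
print(f"mean accuracy: {scores.mean():.4f}")
```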