Journal Articles
81,084 articles found
Real-Time Recognition and Location of Indoor Objects
1
Authors: Jinxing Niu, Qingsheng Hu, Yi Niu, Tao Zhang, Sunil Kumar Jha. Computers, Materials & Continua (SCIE, EI), 2021, Issue 8, pp. 2221-2229 (9 pages)
Object recognition and location has always been one of the research hotspots in machine vision. It is of great value and significance to the development and application of current service robots, industrial automation, unmanned driving and other fields. In order to realize the real-time recognition and location of indoor scene objects, this article proposes an improved YOLOv3 neural network model that combines densely connected networks and residual networks to construct a new YOLOv3 backbone, applied to the detection and recognition of objects in indoor scenes. A RealSense D415 RGB-D camera is used to obtain the RGB map and depth map, and the actual distance value is calculated after each pixel in the scene image is mapped to the real scene. Experimental results show that the detection and recognition accuracy and the real-time performance of the new network are clearly improved compared with the original YOLOv3 model in the same scene. More objects can be detected after the improvement, including objects that the original YOLOv3 network could not detect, and the running time of object detection and recognition is reduced to less than half of the original. The improved network therefore has reference value for practical engineering applications.
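To make the location step concrete, the sketch below back-projects a detected pixel of an aligned depth map into a metric distance. It is a minimal illustration, not the paper's implementation: the pinhole intrinsics, depth scale, and the median-over-box heuristic are assumptions, and the numeric values are placeholders rather than the D415's calibration.

```python
import numpy as np

def pixel_to_point(depth_map, u, v, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project pixel (u, v) of an aligned depth map to a 3D point in meters.

    depth_scale converts raw depth units to meters (RealSense depth is
    commonly stored in millimeters, hence the 0.001 default).
    """
    z = depth_map[v, u] * depth_scale          # metric depth of the pixel
    x = (u - cx) * z / fx                      # pinhole back-projection
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def object_distance(depth_map, box, intrinsics, depth_scale=0.001):
    """Median distance (m) over a detection box (x1, y1, x2, y2) from the detector."""
    x1, y1, x2, y2 = box
    patch = depth_map[y1:y2, x1:x2].astype(np.float64) * depth_scale
    valid = patch[patch > 0]                   # ignore pixels with no depth reading
    return float(np.median(valid)) if valid.size else float("nan")

# Example with synthetic data (placeholder intrinsics, not D415 calibration)
depth = np.full((480, 640), 1500, dtype=np.uint16)   # 1.5 m everywhere
print(pixel_to_point(depth, 320, 240, 600.0, 600.0, 320.0, 240.0))
print(object_distance(depth, (100, 100, 200, 200), None))
```

The second call passes `None` for intrinsics because only the depth values are needed for a range estimate; the full 3D coordinate additionally requires the camera intrinsics shown in `pixel_to_point`.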
Keywords: object recognition, improved YOLOv3 network, RGB-D camera, object location
Real-Time Object Detection and Face Recognition Application for the Visually Impaired
2
Authors: Karshiev Sanjar, Soyoun Bang, Sookhee Ryue, Heechul Jung. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3569-3583 (15 pages)
The advancement of navigation systems for the visually impaired has significantly enhanced their mobility by mitigating the risk of encountering obstacles and guiding them along safe, navigable routes. Traditional approaches primarily focus on broad applications such as wayfinding, obstacle detection, and fall prevention. However, there is a notable gap in applying these technologies to more specific scenarios, such as identifying distinct food crop types or recognizing faces. This study proposes a real-time application designed for visually impaired individuals, aiming to bridge this research-application gap. It introduces a system capable of detecting 20 different food crop types and recognizing faces with accuracies of 83.27% and 95.64%, respectively. These results represent a significant contribution to the field of assistive technologies, providing visually impaired users with detailed and relevant information about their surroundings, thereby enhancing their mobility and ensuring their safety. Additionally, the work addresses the vital aspect of social engagement, acknowledging the challenges faced by visually impaired individuals in recognizing acquaintances without auditory or tactile signals, and highlights recent developments in prototype systems aimed at assisting with face recognition tasks. This comprehensive approach not only promises enhanced navigational aids but also aims to enrich the social well-being and safety of visually impaired communities.
Keywords: artificial intelligence, deep learning, real-time object detection, application
Virtual Keyboard: A Real-Time Hand Gesture Recognition-Based Character Input System Using LSTM and Mediapipe Holistic
3
Authors: Bijon Mallik, Md Abdur Rahim, Abu Saleh Musa Miah, Keun Soo Yun, Jungpil Shin. Computer Systems Science & Engineering, 2024, Issue 2, pp. 555-570 (16 pages)
In the digital age, non-touch communication technologies are reshaping human-device interactions and raising security concerns. A major challenge in current technology is the misinterpretation of gestures by sensors and cameras, often caused by environmental factors. This issue has spurred the need for advanced data processing methods to achieve more accurate gesture recognition and predictions. Our study presents a novel virtual keyboard allowing character input via distinct hand gestures, focusing on two key aspects: hand gesture recognition and character input mechanisms. We developed a novel model with LSTM and fully connected layers for enhanced sequential data processing and hand gesture recognition. We also integrated CNN, max-pooling, and dropout layers for improved spatial feature extraction. This model architecture processes both temporal and spatial aspects of hand gestures, using LSTM to extract complex patterns from frame sequences for a comprehensive understanding of input data. Our unique dataset, essential for training the model, includes 1,662 landmarks from dynamic hand gestures, 33 postures, and 468 face landmarks, all captured in real time using advanced pose estimation. The model demonstrated high accuracy, achieving 98.52% in hand gesture recognition and over 97% in character input across different scenarios. Its excellent performance in real-time testing underlines its practicality and effectiveness, marking a significant advancement in enhancing human-device interactions in the digital age.
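A minimal Keras sketch of the kind of hybrid the abstract describes: Conv1D, max-pooling and dropout for per-frame spatial features, stacked LSTMs and fully connected layers for the temporal part. The sequence length, layer widths and 26-class output are illustrative choices, not the authors' configuration; only the 1,662-value landmark vector per frame follows the abstract.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, N_FEATURES, N_CLASSES = 30, 1662, 26   # illustrative shapes only

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    # Spatial feature extraction over the per-frame landmark vector
    layers.Conv1D(128, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Dropout(0.3),
    # Temporal modelling of the gesture sequence
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(64),
    # Fully connected classification head over gesture classes
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```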
Keywords: hand gesture recognition, MediaPipe Holistic, OpenCV, virtual keyboard, LSTM, human-computer interaction
Real-Time Face Tracking and Recognition in Video Sequence (Cited: 3)
4
Authors: 徐一华, 贾云得, 刘万春, 杨聪. Journal of Beijing Institute of Technology (EI, CAS), 2002, Issue 2, pp. 203-207 (5 pages)
A framework of real-time face tracking and recognition is presented, which integrates skin-color-based tracking and PCA/BPNN (principal component analysis/back-propagation neural network) hybrid recognition techniques. The algorithm is able to track the human face against a complex background and also works well when temporary occlusion occurs. We also obtain a very high recognition rate by averaging a number of samples over a long image sequence. The proposed approach has been successfully tested in many experiments, and can operate at 20 frames/s on an 800 MHz PC.
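The eigenface-plus-BPNN recognition stage can be approximated in a few lines with scikit-learn: PCA supplies the eigenface projection and an MLP trained by back-propagation classifies the projected faces. The data, image size and layer sizes below are synthetic stand-ins, not the paper's setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Illustrative data: 200 flattened 32x32 face crops for 10 subjects
rng = np.random.default_rng(0)
X = rng.random((200, 32 * 32))
y = rng.integers(0, 10, size=200)

# PCA projects faces onto the leading eigenfaces; the MLP (trained with
# back-propagation) classifies the low-dimensional projections.
recognizer = make_pipeline(
    PCA(n_components=40, whiten=True, random_state=0),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)
recognizer.fit(X, y)
print(recognizer.predict(X[:5]))
```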
Keywords: face tracking, pattern recognition, skin color based, eigenface/PCA, artificial neural network
Multi-Stage-Based Siamese Neural Network for Seal Image Recognition
5
Authors: Jianfeng Lu, Xiangye Huang, Caijin Li, Renlin Xin, Shanqing Zhang, Mahmoud Emam. Computer Modeling in Engineering & Sciences (SCIE, EI), 2025, Issue 1, pp. 405-423 (19 pages)
Seal authentication is an important task for verifying the authenticity of stamped seals used in various domains to protect legal documents from tampering and counterfeiting. Stamped seal inspection is commonly audited manually to ensure document authenticity. However, manual assessment of seal images is tedious and labor-intensive due to human errors, inconsistent placement, and incompleteness of the seal. Traditional image recognition systems are not accurate enough to identify seal types, necessitating a neural-network-based method for seal image recognition. However, neural-network-based classification algorithms such as Residual Networks (ResNet) and Visual Geometry Group with 16 layers (VGG16) yield suboptimal recognition rates on stamp datasets. Additionally, the fixed set of training categories makes handling new categories a challenging task. This paper proposes a multi-stage seal recognition algorithm based on a Siamese network to overcome these limitations. Firstly, the seal image is pre-processed by an image rotation correction module based on the Histogram of Oriented Gradients (HOG). Secondly, the similarity between input seal image pairs is measured by a similarity comparison module based on the Siamese network. Finally, the results are compared with the pre-stored standard seal template images in the database to obtain the seal type. To evaluate the performance of the proposed method, we further create a new seal image dataset that contains two subsets with 210,000 valid labeled pairs in total. The proposed work has practical significance in industries where automatic seal authentication is essential, such as the legal, financial, and governmental sectors, where automatic seal recognition can enhance document security and streamline validation processes. Furthermore, the experimental results show that the proposed multi-stage method for seal image recognition outperforms state-of-the-art methods on the two established datasets.
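A hedged sketch of the similarity comparison module only: a shared CNN embeds each seal image of a pair, and a sigmoid over the absolute feature difference scores their similarity. The image size, layer sizes and the difference-based head are illustrative assumptions; the rotation-correction and template-matching stages are omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def embedding_net(input_shape=(96, 96, 1)):
    """Shared CNN that maps a seal image to a feature vector."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
    ])

base = embedding_net()
img_a = layers.Input(shape=(96, 96, 1))
img_b = layers.Input(shape=(96, 96, 1))
emb_a, emb_b = base(img_a), base(img_b)

# Absolute feature difference followed by a sigmoid gives a pair-similarity score
diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([emb_a, emb_b])
score = layers.Dense(1, activation="sigmoid")(diff)

siamese = models.Model([img_a, img_b], score)
siamese.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
siamese.summary()
```

At query time, the input seal would be paired with each stored template and the template with the highest similarity score would determine the seal type.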
Keywords: seal recognition, seal authentication, document tampering, Siamese network, spatial transformer network, similarity comparison network
Real-time recognition of human lower-limb locomotion based on exponential coordinates of relative rotations (Cited: 5)
6
Authors: XU Sen, DING Ye. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2021, Issue 7, pp. 1423-1435 (13 pages)
This paper proposes a data-driven method using a parametric representation of relative rotations to classify human lower-limb locomotion, designed for wearable robots. Three inertial measurement units (IMUs) are mounted on the subject's waist, left knee, and right knee, respectively. Features for classification comprise relative rotations, angular velocities, and waist acceleration. The relative rotations are represented by their exponential coordinates. The rotation matrices are normalized by the Karcher mean, and the Support Vector Machine (SVM) method is then used to train on the data. Experiments are conducted with a time-window size of less than 40 ms. Three SVM classifiers distinguishing 3, 5, and 6 coarse lower-limb locomotion types, respectively, are trained, and the average accuracies are all over 98%. By combining these coarse SVM classifiers, a simple SVM-based ensemble system is proposed to classify 16 fine locomotion types, achieving an average accuracy of 98.22% with a latency of 18 ms when deployed to an onboard computer.
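The exponential-coordinate feature is the rotation vector (log map) of the relative rotation between two IMU orientations, which SciPy exposes directly. The sketch below assumes each IMU reports a unit quaternion and shows an illustrative SVM on synthetic feature windows; the Karcher-mean normalization and the real 16-class ensemble are not reproduced.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R
from sklearn.svm import SVC

def relative_exp_coords(q_waist, q_knee):
    """Exponential coordinates (rotation vector) of the knee orientation
    relative to the waist, from two unit quaternions in (x, y, z, w) order."""
    r_rel = R.from_quat(q_waist).inv() * R.from_quat(q_knee)
    return r_rel.as_rotvec()          # axis * angle, a 3-vector

# Example: identity waist orientation, knee rotated 30 degrees about x
q_waist = [0.0, 0.0, 0.0, 1.0]
q_knee = R.from_euler("x", 30, degrees=True).as_quat()
print(relative_exp_coords(q_waist, q_knee))

# Illustrative SVM on synthetic feature windows (exponential coordinates,
# angular velocities and waist acceleration stacked into one vector per window)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))
y = rng.integers(0, 3, size=300)      # 3 coarse locomotion classes as a stand-in
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```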
Keywords: locomotion recognition, exponential coordinate, relative rotation, support vector machine
Real-time recognition of sows in video: A supervised approach (Cited: 4)
7
Authors: Ehsan Khoramshahi, Juha Hietaoja, Anna Valros, Jinhyeon Yun, Matti Pastell. Information Processing in Agriculture (EI), 2014, Issue 1, pp. 73-81 (9 pages)
This paper proposes a supervised classification approach for the real-time pattern recognition of sows in an animal supervision system (asup). Our approach offers the possibility of foreground subtraction in an asup's image processing module where there is a lack of statistical information regarding the background. A set of 7 farrowing sessions of sows, during day and night, has been captured (approximately 7 days/sow), which is used for this study. The frames of these recordings have been grabbed with a time shift of 20 s. A collection of 215 frames of 7 different sows under the same lighting condition has been annotated and used as the training set. Based on small neighborhoods around a point, a number of local image features are defined, and their separability and performance metrics are compared. For the classification task, a feed-forward neural network (NN) is studied and a realistic configuration in terms of an acceptable level of accuracy and computation time is chosen. The results show that the dense neighborhood feature (d.3×3) is the smallest local set of features with an acceptable level of separability, while having no negative effect on the complexity of the NN. The results also confirm that a significant amount of the desired pattern is accurately detected, even in situations where a portion of the body of a sow is covered by the crate's elements. The performance of the proposed feature set coupled with our chosen configuration reached a rate of 8.5 fps. The true positive rate (TPR) of the classifier is 84.6%, while the false negative rate (FNR) is only about 3%. A comparison between linear logistic regression and the NN shows the highly non-linear nature of our proposed set of features.
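As an illustration of the dense d.3×3 feature, the sketch below stacks each pixel's 3×3 neighborhood into a 9-dimensional vector and feeds it to a small feed-forward network. The network size, normalization and the synthetic frame and labels are assumptions for demonstration, not the paper's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def dense_3x3_features(gray):
    """Stack each interior pixel's 3x3 neighborhood into a 9-dimensional
    feature vector (one row per interior pixel of a grayscale image)."""
    h, w = gray.shape
    feats = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            feats.append(gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].ravel())
    return np.stack(feats, axis=1).astype(np.float32) / 255.0

# Illustrative training data: labels mark sow (1) vs. background (0) pixels
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(60, 80), dtype=np.uint8)
labels = (rng.random((58, 78)) > 0.5).astype(int).ravel()

X = dense_3x3_features(frame)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
clf.fit(X, labels)
mask = clf.predict(X).reshape(58, 78)   # per-pixel sow/background prediction
print(mask.mean())
```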
Keywords: precision farming, supervised classification, real-time image processing, neural network
Real-Time Violent Action Recognition Using Key Frames Extraction and Deep Learning (Cited: 2)
8
Authors: Muzamil Ahmed, Muhammad Ramzan, Hikmat Ullah Khan, Saqib Iqbal, Muhammad Attique Khan, Jung-In Choi, Yunyoung Nam, Seifedine Kadry. Computers, Materials & Continua (SCIE, EI), 2021, Issue 11, pp. 2217-2230 (14 pages)
Violence recognition is crucial because of its applications in activities related to security and law enforcement. Existing semi-automated systems have issues such as tedious manual surveillance, which causes human errors and makes these systems less effective. Several approaches have been proposed using trajectory-based, non-object-centric, and deep-learning-based methods. Previous studies have shown that deep learning techniques attain higher accuracy and lower error rates than other methods; however, their performance must still be improved. This study explores the state-of-the-art deep learning architectures of convolutional neural networks (CNNs) and Inception V4 to detect and recognize violence using video data. In the proposed framework, a keyframe extraction technique eliminates duplicate consecutive frames. This keyframing phase reduces the training data size and hence decreases the computational cost by avoiding duplicate frames. For the feature selection and classification tasks, the applied sequential CNN uses one kernel size, whereas the Inception V4 CNN uses multiple kernel sizes across the layers of the architecture. For empirical analysis, four widely used standard datasets with diverse activities are used. The results confirm that the proposed approach attains 98% accuracy, reduces the computational cost, and outperforms existing techniques for violence detection and recognition.
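The keyframing idea, dropping near-duplicate consecutive frames before they reach the CNN, can be sketched with a simple frame-difference rule. The threshold, the OpenCV-based implementation and the file name are illustrative; the paper's exact keyframe criterion may differ.

```python
import cv2
import numpy as np

def extract_keyframes(video_path, diff_thresh=12.0):
    """Keep a frame only if its mean absolute difference from the last kept
    frame exceeds a threshold, discarding near-duplicate consecutive frames."""
    cap = cv2.VideoCapture(video_path)
    keyframes, last_kept = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if last_kept is None or np.mean(cv2.absdiff(gray, last_kept)) > diff_thresh:
            keyframes.append(frame)
            last_kept = gray
    cap.release()
    return keyframes

# frames = extract_keyframes("surveillance_clip.mp4")   # hypothetical file name
# print(len(frames), "keyframes retained for CNN training")
```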
Keywords: violence detection, violence recognition, deep learning, convolutional neural network, Inception V4, keyframe extraction
A Statistical Framework for Real-Time Traffic Accident Recognition (Cited: 1)
9
Authors: Samy Sadek, Ayoub Al-Hamadi, Bernd Michaelis, Usama Sayed. Journal of Signal and Information Processing, 2010, Issue 1, pp. 77-81 (5 pages)
Over the past decade, automatic traffic accident recognition has become a prominent objective in the area of machine vision and pattern recognition because of its immense application potential in developing autonomous Intelligent Transportation Systems (ITS). In this paper, we present a new framework for the real-time automated recognition of traffic accidents based on the Histogram of Flow Gradient (HFG) and statistical logistic regression analysis. First, optical flow is estimated and the HFG is constructed from video shots. Then vehicle patterns are clustered based on the HFG features. By using logistic regression analysis to fit the data to logistic curves, the classifier model is generated. Finally, the trajectory of the vehicle that caused the accident is determined and recorded. The experimental results on real video sequences demonstrate the efficiency and applicability of the framework and show that it is highly robust and can comfortably provide latency guarantees to real-time surveillance and traffic monitoring applications.
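A simplified stand-in for the HFG descriptor: dense optical flow per frame pair, reduced to a magnitude-weighted orientation histogram, with logistic regression fitted on synthetic shot-level descriptors. The bin count, flow parameters and two-class setup are assumptions, not the paper's exact formulation.

```python
import cv2
import numpy as np
from sklearn.linear_model import LogisticRegression

def flow_orientation_histogram(prev_gray, gray, bins=16):
    """Magnitude-weighted histogram of dense optical-flow orientations for one
    frame pair (a simplified stand-in for the paper's HFG descriptor)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)

# Illustrative classifier: one descriptor per video shot, label 1 = accident
rng = np.random.default_rng(0)
X = rng.random((120, 16))
y = rng.integers(0, 2, size=120)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict_proba(X[:3]))
```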
Keywords: activity pattern, automatic traffic accident recognition, flow gradient, logistic model
Simple Human Gesture Detection and Recognition Using a Feature Vector and a Real-Time Histogram Based Algorithm (Cited: 1)
10
Authors: Iván Gómez-Conde, David Olivieri, Xosé Antón Vila, Stella Orozco-Ochoa. Journal of Signal and Information Processing, 2011, Issue 4, pp. 279-286 (8 pages)
Gesture and action recognition for video surveillance is an active field of computer vision. Nowadays, there are several techniques that attempt to address this problem by 3D mapping at a high computational cost. This paper describes software algorithms that can detect the persons in the scene and analyze different actions and gestures in real time. The motivation of this paper is to create a system for the tele-assistance of the elderly, which could be used as an early-warning monitor for anomalous events such as falls or excessively long periods of inactivity. We use a method for foreground-background segmentation and create a feature vector for discriminating and tracking several people in the scene. Finally, a simple real-time histogram-based algorithm is described for discriminating gestures and body positions through K-Means clustering.
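A minimal sketch of the pipeline's two ends, foreground-background segmentation and K-Means over histogram-style descriptors. The MOG2 subtractor, the row-projection silhouette histogram and the synthetic descriptors are illustrative choices standing in for the paper's feature vector.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

back_sub = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

def silhouette_histogram(frame, bins=32):
    """Foreground mask of a frame reduced to a normalized row-projection
    histogram, a simple body-shape descriptor for clustering postures."""
    mask = back_sub.apply(frame)
    mask = cv2.medianBlur(mask, 5)                 # suppress speckle noise
    rows = mask.sum(axis=1).astype(np.float64)     # vertical silhouette profile
    hist = np.interp(np.linspace(0, len(rows) - 1, bins),
                     np.arange(len(rows)), rows)
    return hist / (hist.sum() + 1e-8)

# Illustrative clustering of per-frame descriptors into K posture groups
# (random vectors stand in for descriptors computed from real frames)
rng = np.random.default_rng(0)
descriptors = rng.random((200, 32))
posture_ids = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(descriptors)
print(np.bincount(posture_ids))
```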
Keywords: computer vision, foreground segmentation, object detection and tracking, gesture recognition, tele-assistance, telecare
An Improved Real-Time Face Recognition System at Low Resolution Based on Local Binary Pattern Histogram Algorithm and CLAHE (Cited: 2)
11
Authors: Kamal Chandra Paul, Semih Aslan. Optics and Photonics Journal, 2021, Issue 4, pp. 63-78 (16 pages)
This research presents an improved real-time face recognition system at a low resolution of 15 pixels with pose, emotion and resolution variations. We have designed our datasets, named LRD200 and LRD100, which have been used for training and classification. The face detection part uses the Viola-Jones algorithm, and the face recognition part receives the face image from the face detection part and processes it using the Local Binary Pattern Histogram (LBPH) algorithm with preprocessing using contrast limited adaptive histogram equalization (CLAHE) and face alignment. The face database in this system can be updated via our custom-built standalone Android app, with automatic restarting of the training and recognition process on an updated database. Using our proposed algorithm, real-time face recognition accuracies of 78.40% at 15 px and 98.05% at 45 px have been achieved using the LRD200 database containing 200 images per person. With 100 images per person in the database (LRD100) the achieved accuracies are 60.60% at 15 px and 95% at 45 px, respectively. A facial deflection of about 30° on either side from the front face showed an average face recognition precision of 72.25%-81.85%. This face recognition system can be employed for law enforcement purposes, where the surveillance camera captures a low-resolution image because of the distance of a person from the camera. It can also be used as a surveillance system in airports, bus stations, etc., to reduce the risk of possible criminal threats.
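CLAHE preprocessing and LBPH recognition map directly onto OpenCV primitives (the LBPH recognizer ships with the opencv-contrib build). The sketch below uses illustrative parameters and random stand-in crops; it shows the training and prediction flow, not the authors' tuned system.

```python
import cv2
import numpy as np

# CLAHE preprocessing as described: boost local contrast before recognition
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

def preprocess(gray_face, size=(45, 45)):
    """Resize to the working resolution and apply CLAHE."""
    return clahe.apply(cv2.resize(gray_face, size))

# LBPH recognizer (requires the opencv-contrib-python package)
recognizer = cv2.face.LBPHFaceRecognizer_create(radius=1, neighbors=8,
                                                grid_x=8, grid_y=8)

# Illustrative training data: random stand-ins for low-resolution face crops
rng = np.random.default_rng(0)
faces = [preprocess(rng.integers(0, 256, (60, 60), dtype=np.uint8)) for _ in range(20)]
labels = np.array([i % 2 for i in range(20)], dtype=np.int32)  # two identities

recognizer.train(faces, labels)
label, confidence = recognizer.predict(faces[0])
print(label, confidence)
```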
Keywords: face detection, face recognition, low resolution, feature extraction, security system, access control system, Viola-Jones algorithm, LBPH (Local Binary Pattern Histogram)
Transmission Considerations with QoS Support to Deliver Real-Time Distributed Speech Recognition Applications
12
Authors: Zhu Xiao-gang, Zhu Hong-wen, Rong Meng-tian. Wuhan University Journal of Natural Sciences (EI, CAS), 2002, Issue 1, pp. 65-70 (6 pages)
Distributed speech recognition (DSR) applications have certain QoS (quality of service) requirements in terms of latency, packet loss rate, etc. To deliver quality-guaranteed DSR applications over wired or wireless links, appropriate QoS mechanisms must be provided. We put forward an RTP/RSVP transmission scheme with a DSR-specific payload and QoS parameters by modifying the present WAP protocol stack. The simulation results show that this scheme provides adequate network bandwidth to sustain the real-time transport of DSR data over either wired or wireless channels.
Keywords: distributed speech recognition, quality of service, real-time transport protocol, resource reservation protocol, wireless application protocol
Resource Efficient Hardware Implementation for Real-Time Traffic Sign Recognition
13
Authors: Huai-Mao Weng, Ching-Te Chiu. Journal of Transportation Technologies, 2018, Issue 3, pp. 209-231 (23 pages)
Traffic sign recognition (TSR, or road sign recognition, RSR) is one of the Advanced Driver Assistance System (ADAS) functions in modern cars. To address the most important issues, real-time operation and resource efficiency, we propose a highly efficient hardware implementation for TSR. We divide the TSR procedure into two stages, detection and recognition. In the detection stage, under the assumption that most German traffic signs have red or blue colors with circle, triangle or rectangle shapes, we use a normalized RGB color transform and single-pass Connected Component Labeling (CCL) to find potential traffic signs efficiently. For single-pass CCL, our contribution is to eliminate the "merge-stack" operations by recording the connected relations of regions in the scan phase and updating the labels in the iterating phase. In the recognition stage, the Histogram of Oriented Gradients (HOG) is used to generate the descriptor of the signs, and we classify the signs with a Support Vector Machine (SVM). In the HOG module, we analyze the minimum number of bits required under different recognition rates. The proposed method achieves a 96.61% detection rate and a 90.85% recognition rate when tested on the GTSDB dataset. Our hardware implementation reduces the storage of CCL and simplifies the HOG computation. The main CCL storage size is reduced by 20% compared to the most advanced design under typical conditions. Using TSMC 90 nm technology, the proposed design operates at a 105 MHz clock rate and processes 135 fps at an image size of 1360 × 800. The chip size is about 1 mm² and the power consumption is close to 8 mW. Therefore, this work is resource efficient and achieves the real-time requirement.
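A software-level sketch of the recognition stage, HOG descriptors classified by a linear SVM, using scikit-image and scikit-learn. The HOG parameters, patch size and synthetic labels are illustrative and say nothing about the fixed-point bit-widths analyzed for the hardware design.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def sign_descriptor(gray_patch):
    """HOG descriptor of a candidate sign region (illustrative parameters)."""
    return hog(gray_patch, orientations=8, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Illustrative training set: 40x40 candidate regions from the detection stage
rng = np.random.default_rng(0)
patches = rng.random((200, 40, 40))
labels = rng.integers(0, 5, size=200)              # 5 sign classes as a stand-in

X = np.array([sign_descriptor(p) for p in patches])
clf = LinearSVC(C=1.0, max_iter=5000).fit(X, labels)
print(clf.predict(X[:5]))
```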
Keywords: traffic sign recognition, Advanced Driver Assistance System, real-time processing, color segmentation, connected component analysis, Histogram of Oriented Gradients, support vector machine, German Traffic Sign Detection Benchmark, CMOS, ASIC, VLSI
A Real-time Face/Hand Tracking Method for Chinese Sign Language Recognition
14
Authors: 刘晋东, Yuan Kui, Zou Wei, Luo Bencheng. High Technology Letters (EI, CAS), 2002, Issue 4, pp. 80-84 (5 pages)
This paper introduces a new Chinese Sign Language Recognition (CSLR) system and a real-time face and hand tracking method applied in the system. In this method, an improved agent algorithm is used to extract the regions of the face and hands and track them. A Kalman filter is introduced to predict the position and the search rectangle, and self-adaptation of the target color is designed to counteract the effect of illumination changes.
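The Kalman prediction step translates to a small constant-velocity filter over the tracked center, which narrows the search rectangle for the next frame. The state layout and noise covariances below are assumptions for illustration, not the paper's tuning.

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter over (x, y, vx, vy); it predicts where the
# face/hand center will be in the next frame, narrowing the search rectangle.
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

def track_step(measured_xy):
    """Predict the next position, then correct with the detected center."""
    predicted = kf.predict()
    kf.correct(np.array(measured_xy, dtype=np.float32).reshape(2, 1))
    return predicted[:2].ravel()

# Illustrative run: the tracked center drifts to the right frame by frame
for t in range(5):
    print(track_step([100.0 + 3 * t, 80.0]))
```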
Keywords: CSLR, improved agent, real-time tracking, Kalman filter
Real-Time Face Detection and Recognition in Complex Background
15
Authors: Xin Zhang, Thomas Gonnot, Jafar Saniie. Journal of Signal and Information Processing, 2017, Issue 2, pp. 99-112 (14 pages)
This paper provides efficient and robust algorithms for real-time face detection and recognition in complex backgrounds. The algorithms are implemented using a series of signal processing methods including AdaBoost, a cascade classifier, Local Binary Patterns (LBP), Haar-like features, facial image pre-processing and Principal Component Analysis (PCA). The AdaBoost algorithm is implemented in a cascade classifier to train the face and eye detectors with robust detection accuracy. The LBP descriptor is utilized to extract facial features for fast face detection. The eye detection algorithm reduces the false face detection rate. The detected facial image is then processed to correct the orientation and increase the contrast, thereby maintaining high facial recognition accuracy. Finally, the PCA algorithm is used to recognize faces efficiently. Large databases with face and non-face images are used to train and validate the face detection and facial recognition algorithms. The algorithms achieve an overall true-positive rate of 98.8% for face detection and 99.2% for correct facial recognition.
Keywords: face detection, facial recognition, AdaBoost algorithm, cascade classifier, Local Binary Pattern, Haar-like features, Principal Component Analysis
Real-time Health Condition Evaluation on Wind Turbines Based on Operational Condition Recognition (Cited: 12)
16
Authors: DONG Yuliang, LI Yaqiong, CAO Haibin, HE Chengbing, GU Yujiong. 《中国电机工程学报》 (Proceedings of the CSEE) (EI, CSCD, PKU Core), 2013, Issue 11, pp. I0013-I0013, 15 (1 page)
To address the problem that the operating conditions and state information of large wind turbines are complex and their health state is difficult to evaluate accurately, a real-time health evaluation method based on operating-condition recognition is proposed. The method fully considers the complexity and variability of the unit's operating conditions and uses condition recognition to partition the operating-condition space. Within each operating-condition subspace, a health evaluation model based on Gaussian mixture model (GMM) multi-feature fusion is established. The health degradation index (HDI) is adopted as the health evaluation indicator, and a method for determining the degradation alarm threshold is given. The method was applied to the health evaluation of the drivetrain of a 1.5 MW wind turbine before a fault occurred. The results show that the method detected the degradation trend of the unit's health state in advance, enabling early fault warning, avoiding serious faults, and providing a basis for adjusting operation and scheduling maintenance.
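A minimal sketch of the per-condition GMM evaluation: fit a Gaussian mixture on healthy multi-feature data, then score new windows against it. The HDI formula used here (a normalized log-likelihood drop, clipped to [0, 1]) is an assumed stand-in, since the abstract does not give the exact definition.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit one GMM per operating-condition subspace on multi-feature data recorded
# while the turbine is known to be healthy (synthetic stand-in data here).
rng = np.random.default_rng(0)
healthy_features = rng.normal(loc=0.0, scale=1.0, size=(2000, 5))
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(healthy_features)

baseline = gmm.score_samples(healthy_features).mean()   # healthy log-likelihood level

def health_degradation_index(sample_window):
    """Assumed HDI formulation: 0 when a window of new samples is as likely as
    the healthy baseline, approaching 1 as its likelihood drops."""
    ll = gmm.score_samples(sample_window).mean()
    return float(np.clip((baseline - ll) / abs(baseline), 0.0, 1.0))

healthy_window = rng.normal(0.0, 1.0, size=(50, 5))
degraded_window = rng.normal(1.5, 1.6, size=(50, 5))    # drifted features
print(health_degradation_index(healthy_window))
print(health_degradation_index(degraded_window))
```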
Keywords: wind turbine, health condition, condition recognition, evaluation, real-time, operation, evaluation method, state information
SlowFast Based Real-Time Human Motion Recognition with Action Localization
17
Authors: Gyu-Il Kim, Hyun Yoo, Kyungyong Chung. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 11, pp. 2135-2152 (18 pages)
Artificial intelligence is increasingly being applied in the field of video analysis, particularly in the area of public safety, where video surveillance equipment such as closed-circuit television (CCTV) is used and automated analysis of video information is required. However, various issues such as data size limitations and low processing speeds make real-time extraction of video data challenging. Video analysis technology applies object classification, detection, and relationship analysis to continuous 2D frame data, and the various meanings within the video are then analyzed based on the extracted basic data. Motion recognition is key in this analysis. Motion recognition is a challenging field that analyzes human body movements, requiring the interpretation of complex movements of human joints and the relationships between various objects. The deep-learning-based human skeleton detection algorithm is a representative motion recognition approach. Recently, motion analysis models with excellent performance, such as the SlowFast network, have also been developed. However, these models do not operate properly in most outdoor wide-angle video environments and show low response speeds, as is expected when extracting motion classes from high-resolution images. The proposed method achieves a high level of extraction and accuracy by improving SlowFast's input data preprocessing and data structure. The input data are preprocessed through object tracking and background removal using YOLO and DeepSORT. A higher performance than that of a single model is achieved by improving the existing SlowFast data structure into a frame-unit structure. Based on the confusion matrix, accuracies of 70.16% and 70.74% were obtained for the existing SlowFast and the proposed model, respectively, indicating a 0.58% increase in accuracy. Comparing detections based on behavioral classification, the existing SlowFast detected 2,341,164 cases, whereas the proposed model detected 3,119,323 cases, an increase of 33.23%.
Keywords: artificial intelligence, convolutional neural network, video analysis, human action recognition, skeleton extraction
Enhancing Human-Machine Interaction: Real-Time Emotion Recognition through Speech Analysis
18
Authors: Dominik Esteves de Andrade, Rüdiger Buchkremer. Journal of Computer Science Research, 2023, Issue 3, pp. 22-45 (24 pages)
Humans, as intricate beings driven by a multitude of emotions, possess a remarkable ability to decipher and respond to socio-affective cues. However, many individuals and machines struggle to interpret such nuanced signals, including variations in tone of voice. This paper explores the potential of intelligent technologies to bridge this gap and improve the quality of conversations. In particular, the authors propose a real-time processing method that captures and evaluates emotions in speech, utilizing a terminal device such as the Raspberry Pi computer. Furthermore, the authors provide an overview of the current research landscape surrounding speech emotion recognition and delve into the methodology, which involves analyzing audio files from renowned emotional speech databases. To aid comprehension, the authors present visualizations of these audio files in situ, employing dB-scaled Mel spectrograms generated with TensorFlow and Matplotlib. The authors use a support vector machine kernel and a convolutional neural network with transfer learning to classify emotions. Notably, the classification accuracies achieved are 70% and 77%, respectively, demonstrating the efficacy of the approach when executed on an edge device rather than relying on a server. The system can evaluate pure emotion in speech and provide corresponding visualizations depicting the speaker's emotional state in less than one second on a Raspberry Pi. These findings pave the way for more effective and emotionally intelligent human-machine interactions in various domains.
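The dB-scaled Mel-spectrogram front end plus an SVM back end can be sketched with librosa and scikit-learn. Averaging the spectrogram over time into a fixed-length vector is a simplification for illustration; the sampling rate, Mel-band count and synthetic features are assumptions, not the authors' pipeline.

```python
import numpy as np
import librosa
from sklearn.svm import SVC

def mel_feature(path, sr=16000, n_mels=64):
    """dB-scaled Mel spectrogram of a clip, averaged over time into a fixed-size
    feature vector (a simplified stand-in for the paper's full pipeline)."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    return mel_db.mean(axis=1)                     # n_mels-dimensional vector

# Illustrative classifier over precomputed features (labels 0..3 = emotions)
rng = np.random.default_rng(0)
X = rng.normal(size=(160, 64))
y = rng.integers(0, 4, size=160)
clf = SVC(kernel="rbf", probability=True).fit(X, y)
print(clf.predict(X[:5]))
```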
Keywords: speech emotion recognition, edge computing, real-time computing, Raspberry Pi
BCCLR: A Skeleton-Based Action Recognition with Graph Convolutional Network Combining Behavior Dependence and Context Clues (Cited: 4)
19
Authors: Yunhe Wang, Yuxin Xia, Shuai Liu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 4489-4507 (19 pages)
In recent years, skeleton-based action recognition has made great achievements in computer vision. A graph convolutional network (GCN) is effective for action recognition, modelling the human skeleton as a spatio-temporal graph. Most GCNs define the graph topology by the physical relations of the human joints. However, this predefined graph ignores the spatial relationship between non-adjacent joint pairs in special actions and the behavior dependence between joint pairs, resulting in a low recognition rate for specific actions with implicit correlations between joint pairs. In addition, existing methods ignore the trend correlation between adjacent frames within an action and context clues, leading to erroneous recognition of actions with similar poses. Therefore, this study proposes a learnable GCN based on behavior dependence, which considers implicit joint correlation by constructing a dynamic learnable graph that extracts the specific behavior dependence of joint pairs. By using the weight relationship between the joint pairs, an adaptive model is constructed. A self-attention module is also designed to obtain the inter-frame topological relationship for exploring the context of actions. Combining the shared topology and the multi-head self-attention map, the module obtains a context-based clue topology to update the dynamic graph convolution, achieving accurate recognition of different actions with similar poses. Detailed experiments on public datasets demonstrate that the proposed method achieves better results and realizes a higher-quality representation of actions under various evaluation protocols compared to state-of-the-art methods.
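A basic spatial graph convolution over a skeleton, the building block the abstract starts from, reduces to multiplying joint features by a symmetrically normalized adjacency with self-loops. The toy five-joint chain, random weights and single-layer setup below are illustrative; the paper's learnable topology and self-attention modules are not reproduced.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Symmetrically normalized adjacency with self-loops:
    A_hat = D^-1/2 (A + I) D^-1/2, the propagation rule of a basic GCN layer."""
    A = np.eye(num_joints)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

def gcn_layer(X, A_hat, W):
    """One spatial graph convolution over joint features: ReLU(A_hat @ X @ W)."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Toy 5-joint chain (e.g., hip-knee-ankle style connectivity), 3-D joint inputs
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A_hat = normalized_adjacency(edges, num_joints=5)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))          # per-joint coordinates for one frame
W = rng.normal(size=(3, 8))          # learnable weights (random here)
print(gcn_layer(X, A_hat, W).shape)  # (5, 8) joint embeddings
```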
Keywords: action recognition, deep learning, GCN, behavior dependence, context clue, self-attention
Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network (Cited: 1)
20
Authors: Arnab Dey, Samit Biswas, Dac-Nhuong Le. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 3067-3087 (21 pages)
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, YouTube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Keywords: workout action recognition, video stream action recognition, residual network, GRU, attention