Journal Articles
2 articles found
1. Multimodal Spatiotemporal Feature Map for Dynamic Gesture Recognition
Authors: Xiaorui Zhang, Xianglong Zeng, Wei Sun, Yongjun Ren, Tong Xu. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 671-686 (16 pages).
Gesture recognition technology enables machines to read human gestures and has significant application prospects in human-computer interaction and sign language translation. Existing studies usually use convolutional neural networks to extract features directly from raw gesture data, but the networks are affected by the large amount of interfering information in the input and therefore fit unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information that enhances the key features required for gesture recognition, such as the shape, structure, contour, position, and motion of the hand, thereby improving recognition accuracy. The method can encode an arbitrary number of frames of gesture data into a single-frame spatio-temporal feature map, which is then used as the input to the neural network. This guides the model to fit important features while avoiding complex recurrent network structures for extracting temporal features. In addition, we design two sub-networks and train the model with a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so that the sub-networks neither focus too much on a single category of feature nor are overly influenced by each other's features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves state-of-the-art accuracy.
Keywords: dynamic gesture recognition; spatio-temporal information encoding; multimodal input; pre-training; score fusion
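The abstract does not describe the exact encoding, so the following is only a minimal illustrative sketch of the general idea: collapsing an arbitrary number of frames into a single multi-channel spatio-temporal feature map that a plain 2D CNN can consume. The channel choices, weighting scheme, and the `encode_spatiotemporal_map` helper are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch (not the paper's exact encoding): collapse a clip of
# T grayscale gesture frames into one multi-channel spatio-temporal feature
# map, so a 2D CNN can be used without a recurrent module.
import numpy as np

def encode_spatiotemporal_map(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) float array in [0, 1]. Returns an (H, W, 3) map."""
    t = frames.shape[0]
    # Channel 0: temporal-rank-weighted average -- later frames weigh more,
    # preserving a rough ordering of the motion.
    weights = np.arange(1, t + 1, dtype=np.float64)
    weights /= weights.sum()
    rank_pooled = np.tensordot(weights, frames, axes=(0, 0))
    # Channel 1: motion energy -- summed absolute frame differences highlight
    # where the hand moved and suppress the static background.
    motion = np.abs(np.diff(frames, axis=0)).sum(axis=0)
    motion /= motion.max() + 1e-8
    # Channel 2: last frame -- keeps the final hand shape and contour.
    last = frames[-1]
    return np.stack([rank_pooled, motion, last], axis=-1).astype(np.float32)

if __name__ == "__main__":
    clip = np.random.rand(32, 112, 112)   # any number of frames works
    fmap = encode_spatiotemporal_map(clip)
    print(fmap.shape)                     # (112, 112, 3)
```

The point of such an encoding is that the temporal dimension is folded into channels once, up front, so the downstream classifier stays a standard image network.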
2. End-to-End Multiview Gesture Recognition for Autonomous Car Parking System
Authors: Hassene Ben AMARA, Fakhri KARRAY. Instrumentation, 2019, No. 3, pp. 76-92 (17 pages).
The use of hand gestures can be the most intuitive human-machine interaction medium. Early approaches to hand gesture recognition used device-based methods, relying on mechanical or optical sensors attached to a glove or on markers, which hinder natural human-machine communication. Vision-based methods, on the other hand, are less restrictive and allow more spontaneous communication without an intermediary between human and machine; vision-based gesture recognition has therefore been a popular research area for the past thirty years. Hand gesture recognition finds application in many areas, particularly the automotive industry, where designers of advanced automotive human-machine interfaces (HMIs) use gesture recognition to improve driver and vehicle safety. However, technology advances go beyond active/passive safety into convenience and comfort. In this context, one of America's big three automakers has partnered with the Centre for Pattern Analysis and Machine Intelligence (CPAMI) at the University of Waterloo to investigate expanding their product segment through machine learning, with the particular application of hand gesture recognition for autonomous car parking. The present paper leverages state-of-the-art deep learning and optimization techniques to develop a vision-based multiview dynamic hand gesture recognizer for a self-parking system. We propose a 3D-CNN gesture model architecture that we train on a publicly available hand gesture database. We apply transfer learning to fine-tune the pre-trained gesture model on custom-made data, which significantly improves the system's performance in a real-world environment. We adapt the end-to-end architecture to expand the state-of-the-art video classifier from a single monocular camera input to a multiview 360-degree feed provided by a six-camera module. Finally, we optimize the proposed solution to run on a resource-limited embedded platform (Nvidia Jetson TX2) used by automakers for vehicle-based features, without sacrificing the accuracy, robustness, and real-time functionality of the system.
Keywords: deep learning; video classification; dynamic hand gesture recognition; multiview; embedded platform; automotive; vehicle self-parking
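For illustration only, here is a hypothetical PyTorch sketch of the general idea described in the abstract: a small 3D-CNN backbone shared across six camera views, with simple average fusion of per-view features before classification. The layer sizes, the `Small3DCNN`/`MultiviewGestureNet` names, and the mean-fusion choice are assumptions, not the paper's architecture.

```python
# Hypothetical sketch (not the paper's model): extend a single-view 3D-CNN
# video classifier to a multiview feed by sharing the backbone across views
# and fusing the per-view features.
import torch
import torch.nn as nn

class Small3DCNN(nn.Module):
    """Per-view backbone over clips shaped (B, C, T, H, W)."""
    def __init__(self, in_channels: int = 3, feat_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),   # global spatio-temporal pooling
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        return self.proj(self.features(clip).flatten(1))

class MultiviewGestureNet(nn.Module):
    """Shares one backbone across all views, mean-fuses, then classifies."""
    def __init__(self, num_views: int = 6, num_classes: int = 25):
        super().__init__()
        self.backbone = Small3DCNN()          # weights shared across views
        self.classifier = nn.Linear(128, num_classes)
        self.num_views = num_views

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (B, V, C, T, H, W) -- one clip per camera view
        b, v = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, v, -1)
        return self.classifier(feats.mean(dim=1))   # average-fuse the views

if __name__ == "__main__":
    model = MultiviewGestureNet()
    dummy = torch.randn(2, 6, 3, 16, 64, 64)   # 2 clips, 6 views, 16 frames
    print(model(dummy).shape)                   # torch.Size([2, 25])
```

Sharing one backbone keeps the parameter count close to the single-view model, which matters when the target is a constrained embedded platform such as the Jetson TX2 mentioned above.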