Abstract: Aiming at people with hearing and speech impairments, this paper proposes a multi-algorithm fusion method for gesture recognition. The goal is to distinguish easily confused gestures more clearly and to improve gesture recognition accuracy by integrating lip-reading recognition. For gesture recognition, the paper first performs skin-color processing and segmentation on the hand region of the collected video sequence, then detects hand feature points by calling a hand key-point model. The extracted gesture features are trained and recognized with a support vector machine (SVM). For lip-reading recognition, the paper first uses the AdaBoost algorithm to detect and track key points in the collected video sequence and locate the lips, extracts lip key points with a convolutional neural network, and feeds the resulting key-point feature sequence into a BiLSTM to extract semantic information. Fusing the gesture recognition and lip-reading recognition algorithms with the YOLOv5 model effectively improves gesture recognition accuracy. Experimental verification shows that the recognition rate increases from 89.4% to 94.3%.
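As a rough illustration of the gesture branch, the sketch below pairs skin-color segmentation with an SVM classifier. The YCrCb threshold values, the use of OpenCV and scikit-learn, and the feature layout are all assumptions, since the abstract does not specify them; the paper's hand key-point model is stood in for by a hypothetical placeholder extractor.

```python
# Sketch of the gesture branch: skin-color segmentation + SVM classification.
# Assumptions not in the abstract: OpenCV for image ops, scikit-learn's SVC,
# YCrCb skin thresholds, and a placeholder for the hand key-point model.
import cv2
import numpy as np
from sklearn.svm import SVC

def segment_skin(frame_bgr):
    """Mask out skin-colored pixels in YCrCb space (threshold values assumed)."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    return cv2.bitwise_and(frame_bgr, frame_bgr, mask=mask)

def extract_hand_keypoints(hand_region):
    """Hypothetical stand-in for the paper's hand key-point model.

    A real implementation would run a pretrained key-point network and
    flatten the (x, y) coordinates of the detected hand landmarks; here we
    just downsample to a fixed-size vector so the pipeline is runnable.
    """
    gray = cv2.cvtColor(hand_region, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (8, 8))
    return resized.flatten().astype(np.float32)

def train_gesture_svm(frames, labels):
    """Train an SVM over the extracted gesture features (RBF kernel assumed)."""
    feats = [extract_hand_keypoints(segment_skin(f)) for f in frames]
    clf = SVC(kernel="rbf")
    clf.fit(feats, labels)
    return clf
```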
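The lip-reading branch feeds a sequence of CNN-extracted lip key points into a BiLSTM. A minimal PyTorch sketch of that sequence model follows; the layer sizes, the number of lip key points (20 per frame), and the classification head are illustrative assumptions rather than the paper's reported architecture.

```python
# Minimal sketch of the lip-reading sequence model: a BiLSTM over per-frame
# lip key-point features, ending in a class prediction for the whole clip.
# All sizes (20 lip points, hidden width, number of classes) are assumptions.
import torch
import torch.nn as nn

class LipBiLSTM(nn.Module):
    def __init__(self, num_keypoints=20, hidden=128, num_classes=30):
        super().__init__()
        # Each frame is represented by the (x, y) coordinates of its lip key points.
        self.lstm = nn.LSTM(
            input_size=num_keypoints * 2,
            hidden_size=hidden,
            batch_first=True,
            bidirectional=True,  # BiLSTM: forward + backward passes over the sequence
        )
        self.head = nn.Linear(hidden * 2, num_classes)

    def forward(self, x):
        # x: (batch, frames, num_keypoints * 2)
        out, _ = self.lstm(x)
        # Use the final time step's bidirectional state as the sequence summary.
        return self.head(out[:, -1, :])

# Example: a batch of 4 clips, 25 frames each, 20 lip points per frame.
logits = LipBiLSTM()(torch.randn(4, 25, 40))
```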