Journal Articles
4 articles found
1. Developing phoneme-based lip-reading sentences system for silent speech recognition
Authors: Randa El-Bialy, Daqing Chen, Souheil Fenghour, Walid Hussein, Perry Xiao, Omar H. Karam, Bo Li. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, Issue 1, pp. 129-138.
Lip-reading is a process of interpreting speech by visually analysing lip movements. Recent research in this area has shifted from simple word recognition to lip-reading sentences in the wild. This paper uses phonemes as a classification schema for lip-reading sentences, both to explore an alternative schema and to enhance system performance. Different classification schemas have been investigated, including character-based and viseme-based schemas. The visual front-end of the system consists of a spatial-temporal (3D) convolution followed by a 2D ResNet, and Transformers using multi-headed attention serve as the phoneme recognition model. A Recurrent Neural Network is used as the language model. The performance of the proposed system has been evaluated on the BBC Lip Reading Sentences 2 (LRS2) benchmark dataset. Compared with state-of-the-art approaches to lip-reading sentences, the proposed system achieves a word error rate that is, on average, 10% lower under varying illumination ratios.
Keywords: deep learning, deep neural networks, lip-reading, phoneme-based lip-reading, spatial-temporal convolution, transformers
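Below is a minimal PyTorch sketch of the front-end this abstract describes: a spatial-temporal (3D) convolution feeding a per-frame 2D ResNet, with a Transformer encoder over the resulting frame features producing phoneme logits. Layer sizes and the phoneme inventory size (NUM_PHONEMES) are illustrative assumptions rather than the authors' configuration, and the RNN language model is omitted.

```python
# A minimal sketch, assuming illustrative layer sizes and phoneme inventory.
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_PHONEMES = 40  # assumed size of the phoneme inventory

class PhonemeLipReader(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        # Spatial-temporal front-end: one 3D convolution over (T, H, W)
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 64, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3)),
            nn.BatchNorm3d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)),
        )
        # 2D ResNet applied frame by frame; input/output layers adapted
        trunk = resnet18(weights=None)
        trunk.conv1 = nn.Conv2d(64, 64, kernel_size=7, stride=2, padding=3, bias=False)
        trunk.fc = nn.Linear(trunk.fc.in_features, d_model)
        self.resnet = trunk
        # Transformer encoder with multi-headed attention for phoneme recognition
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, NUM_PHONEMES)

    def forward(self, x):            # x: (B, 1, T, H, W) grayscale mouth crops
        f = self.conv3d(x)           # (B, 64, T, H', W')
        B, C, T, H, W = f.shape
        f = f.transpose(1, 2).reshape(B * T, C, H, W)
        f = self.resnet(f).view(B, T, -1)   # one d_model vector per frame
        f = self.encoder(f)
        return self.head(f)          # per-frame phoneme logits: (B, T, NUM_PHONEMES)

logits = PhonemeLipReader()(torch.randn(2, 1, 16, 96, 96))  # -> (2, 16, 40)
```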
2. Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning
Authors: Nada Faisal Aljohani, Emad Sami Jaha. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 9, pp. 3037-3058.
The continuing advances in deep learning have paved the way for several challenging ideas. One such idea is visual lip-reading, which has recently drawn much research interest. Lip-reading, often referred to as visual speech recognition, is the ability to understand and predict spoken speech based solely on lip movements, without using sound. Owing to the lack of research on visual speech recognition for the Arabic language in general, and its absence in Quranic research, this work aims to fill that gap. The paper introduces a new publicly available Arabic lip-reading dataset containing 10,490 videos captured from multiple viewpoints, comprising data samples at the letter level (i.e., single letters (single alphabets) and Quranic disjoined letters) and at the word level, based on the content and context of the book Al-Qaida Al-Noorania. The research uses visual speech recognition to recognise spoken Arabic letters (Arabic alphabets), Quranic disjoined letters, and Quranic words, mainly phonetically as they are recited in the Holy Quran according to the Quranic study aid Al-Qaida Al-Noorania. The study could further validate the correctness of pronunciation and, subsequently, assist people in correctly reciting the Quran. A detailed description of the created dataset and its construction methodology is provided. The new dataset is used to train an effective pre-trained deep learning CNN model through transfer learning for lip-reading, achieving accuracies of 83.3%, 80.5%, and 77.5% on words, disjoined letters, and single letters, respectively; an extended analysis of the results is provided. Finally, the experimental outcomes, different research aspects, and dataset collection consistency and challenges are discussed, concluding with several promising directions for future work.
Keywords: visual speech recognition, lip-reading, deep learning, Quranic Arabic dataset, Tajwid
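The transfer-learning recipe this abstract mentions can be sketched as follows: reuse an ImageNet-pretrained CNN as a frozen feature extractor and train only a new classification head on the lip-reading classes. The backbone choice (MobileNetV2), the class count, and the dummy input pipeline below are illustrative assumptions; the abstract does not specify them.

```python
# A minimal transfer-learning sketch, assuming a MobileNetV2 backbone
# and an illustrative class count; not the paper's exact setup.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights

NUM_CLASSES = 28  # assumed: e.g., one class per Arabic letter

model = mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT)  # pretrained backbone
for p in model.features.parameters():
    p.requires_grad = False                  # freeze the pretrained features
model.classifier[1] = nn.Linear(model.last_channel, NUM_CLASSES)  # new head

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy lip-region frames
frames = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
optimizer.zero_grad()
loss = criterion(model(frames), labels)
loss.backward()
optimizer.step()
```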
3. HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks (cited by 2)
Authors: Amany M. Sarhan, Nada M. Elshennawy, Dina M. Ibrahim. Computers, Materials & Continua (SCIE, EI), 2021, Issue 8, pp. 1531-1549.
Lip reading is typically regarded as visually interpreting the speaker's lip movements while they are speaking; it is the task of decoding text from the speaker's mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems understand a speaker, by capturing a video of the speaker and feeding it into the model to obtain the corresponding subtitles. Deep learning technologies make it easier to extract a large number of different features, which can then be converted into probabilities of letters to obtain accurate results. Recently proposed methods for lip reading are based on sequence-to-sequence architectures originally designed for neural machine translation and audio speech recognition. In this paper, however, a deep convolutional neural network model called the hybrid lip-reading (HLR-Net) model is developed for lip reading from video. The proposed model comprises three stages, namely preprocessing, encoder, and decoder stages, which produce the output subtitle. Inception, gradient, and bidirectional GRU layers are used to build the encoder, while attention, fully-connected, and activation-function layers are used to build the decoder, which performs connectionist temporal classification (CTC). Compared with three recent models, namely the LipNet model, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, on the GRID corpus dataset, the proposed HLR-Net model achieves significant improvements: a CER of 4.9%, a WER of 9.7%, and a BLEU score of 92% for unseen speakers, and a CER of 1.4%, a WER of 3.3%, and a BLEU score of 99% for overlapped speakers.
Keywords: lip-reading, visual speech recognition, deep neural network, connectionist temporal classification
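Below is a condensed sketch of the CTC portion of an HLR-Net-style pipeline: per-frame visual features pass through a bidirectional GRU and a fully-connected layer, and the CTC loss aligns the frame-level outputs with the target character sequence. Feature and vocabulary sizes are assumed for illustration, and the inception and attention blocks are omitted for brevity.

```python
# A condensed CTC training sketch, assuming illustrative feature and
# vocabulary sizes; the full HLR-Net encoder/decoder is not reproduced.
import torch
import torch.nn as nn

FEAT_DIM, VOCAB = 256, 28            # assumed: 27 characters + CTC blank (id 0)

gru = nn.GRU(FEAT_DIM, 128, num_layers=2, bidirectional=True, batch_first=True)
fc = nn.Linear(256, VOCAB)           # 2 x 128 bidirectional hidden units in
ctc = nn.CTCLoss(blank=0)

feats = torch.randn(4, 75, FEAT_DIM)          # (batch, frames, feature_dim)
hidden, _ = gru(feats)
log_probs = fc(hidden).log_softmax(-1)        # (batch, frames, vocab)

targets = torch.randint(1, VOCAB, (4, 30))    # dummy character labels, no blanks
input_lens = torch.full((4,), 75, dtype=torch.long)
target_lens = torch.full((4,), 30, dtype=torch.long)

# nn.CTCLoss expects log-probs shaped (frames, batch, vocab)
loss = ctc(log_probs.transpose(0, 1), targets, input_lens, target_lens)
loss.backward()
```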
4. A Study on the Impact of Voice-to-Text Technology on Academic Achievement of the Hearing-Impaired
Author: Zhe Wang. Journal of Contemporary Educational Research, 2024, Issue 8, pp. 276-282.
Hearing loss is a significant barrier to academic achievement, with hearing-impaired (HI) individuals often facing challenges in speech recognition, language development, and social interaction. Lip-reading, a crucial skill for HI individuals, is essential for effective communication and learning. The COVID-19 pandemic, however, exacerbated the challenges faced by HI individuals, with face masks hindering lip-reading. This literature review explores the relationship between hearing loss and academic achievement, highlighting the importance of lip-reading and the potential of artificial intelligence (AI) techniques to mitigate these challenges. The introduction of Voice-to-Text (VtT) technology, which provides real-time text captions, can significantly improve speech recognition and academic performance for HI students. AI models, such as Hidden Markov models and Transformer models, can enhance the accuracy and robustness of VtT technology in diverse educational settings. Furthermore, VtT technology can facilitate better teacher-student interaction, provide transcripts of lectures and classroom discussions, and bridge the gap in standardized testing performance between HI and hearing students. While challenges and limitations exist, the successful implementation of VtT technology can promote inclusive education and enhance academic achievement. Future research directions include popularizing VtT technology, addressing technological barriers, and customizing VtT systems to cater to individual needs.
Keywords: lip-reading, hearing-impaired, Voice-to-Text, academic achievement, Hidden Markov models, Transformer models, inclusive education