A MASM-Based Lip Contour Feature Extraction Method and Audio-Visual Speech Recognition (Cited by: 1)

A Lip Contour Extraction Method Based on Multiple Active Shape Model (MASM) for Audio Visual Speech Recognition
Abstract: A lip contour extraction method based on MASM is proposed for audio-visual speech recognition. The method needs only a small amount of training data to extract a large number of lip contours accurately. A smoothing correction method for lip contours is also introduced; it exploits the continuous variation of the mouth shape to correct erroneous contours. Experiments show that contour extraction accuracy with this method is 20 percentage points higher than with the conventional ASM model. The lip contour feature is then introduced into audio-visual speech recognition.

In audio-visual speech recognition and lipreading, the widely used ASM (Active Shape Model) for lip contour extraction lacks robustness and cannot extract exact lip contours, because the mouth shape varies widely during speech. We present a more robust model, the Multiple Active Shape Model (MASM). The model classifies mouth shapes into a closed-mouth set, a half-opened-mouth set, and a round-mouth set. An independent ASM is built for each set from a very small amount of training data. The MASM contour extraction algorithm automatically selects the most accurate lip contour from the multiple shape-search procedures. Because the mouth changes continuously from frame to frame, a method for smoothing lip contours is also presented to correct contour extraction errors. Experimental results on the AVCONDIG database show that the extraction accuracy achieved by the MASM is 13% higher than that of the conventional ASM. Combining the MASM with the contour-smoothing method yields a further 7% improvement in accuracy. When the exact lip contour feature is fused with the audio MFCC (Mel Frequency Cepstral Coefficients) feature, the average word recognition rates on the connected-digits speech recognition task considered increase considerably under noisy acoustic conditions.
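The two steps described in the abstract, choosing one contour from several per-class shape searches and then correcting isolated errors using neighbouring frames, can be illustrated with a minimal sketch. The abstract does not give the selection criterion or the smoothing rule, so everything here is an assumption made for illustration: the function names, the per-candidate `cost` score (lower is better), the distance threshold, and the midpoint replacement rule are all hypothetical and are not the authors' method.

```python
def select_best_contour(candidates):
    """Pick the candidate contour with the lowest fit cost.

    `candidates` maps a model name (e.g. "closed") to a (contour, cost)
    pair. `contour` is a list of (x, y) points; `cost` is a hypothetical
    fit-error score, since the paper's criterion is not stated here.
    """
    name = min(candidates, key=lambda k: candidates[k][1])
    return name, candidates[name][0]


def mean_point_distance(a, b):
    """Average Euclidean distance between corresponding contour points."""
    total = sum(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def smooth_contours(frames, threshold):
    """Replace isolated outlier contours by the midpoint of their neighbours.

    Assumed rule: a frame counts as an extraction error when it is farther
    than `threshold` from both neighbours while the two neighbours agree
    with each other. Neighbours are always read from the original frames.
    """
    fixed = [list(f) for f in frames]
    for t in range(1, len(frames) - 1):
        prev, cur, nxt = frames[t - 1], frames[t], frames[t + 1]
        if (mean_point_distance(cur, prev) > threshold
                and mean_point_distance(cur, nxt) > threshold
                and mean_point_distance(prev, nxt) <= threshold):
            fixed[t] = [((px + nx) / 2, (py + ny) / 2)
                        for (px, py), (nx, ny) in zip(prev, nxt)]
    return fixed
```

In use, each frame would be searched once per mouth-shape model, `select_best_contour` would keep one result per frame, and `smooth_contours` would then run over the resulting sequence. The threshold would need tuning to the image scale and frame rate.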
Source: Journal of Northwestern Polytechnical University (EI, CAS, CSCD, PKU Core), 2004, No. 5, pp. 674-678 (5 pages)
Funding: Supported by the International Science and Technology Cooperation Project between the Ministry of Science and Technology of China and the Flemish Region of Belgium (Guo Ke Wai No. 19990209)
Keywords: speech recognition, audio visual speech recognition, ASM (Active Shape Model), MASM (Multiple Active Shape Model), lip contour extraction