Research on pixel-based feature extraction in lip-reading (cited by: 3)
Abstract: This paper addresses pixel-based feature extraction for lip-reading that relies on the visual channel alone, and proposes a three-stage cascaded feature extraction strategy. First, a transform is applied to the image; second, the dimensionality of the transformed result is reduced; third, all feature vectors are normalized to a uniform scale. Based on a comparison and analysis of several transform methods, the paper proposes DCT-PCA and Gabor-PCA, which use PCA to reduce the dimensionality of the DCT and Gabor wavelet transform outputs. Compared with the traditional approach of manually selecting transform coefficients, these methods improve the recognition rate by about 10%.
Source: Computer Engineering and Applications (indexed in CSCD; Peking University Core Journal), 2007, No. 20, pp. 197-199, 221 (4 pages)
Funding: Program for New Century Excellent Talents (No. NCET-05-0334); Natural Science Foundation of Heilongjiang Province of China (Grant No. E2005-09)
Keywords: lip-reading; feature extraction; PCA; DCT; Gabor
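
The three-stage cascade described in the abstract (transform, dimensionality reduction, normalization) can be sketched for the DCT-PCA variant as follows. This is a minimal illustration and not the authors' implementation: the 8x8 block size, the number of retained components, the use of z-score scaling as the normalization step, the power-iteration PCA, and all function names are assumptions made for the example, and the input is random synthetic data rather than mouth-region images.

```python
import math
import random


def dct2(block):
    """Orthonormal 2-D DCT-II of a square grayscale block given as a list of rows."""
    n = len(block)
    # basis[k][i] is the k-th orthonormal DCT-II basis function sampled at pixel i.
    basis = [
        [math.sqrt((1 if k == 0 else 2) / n) * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]
    # Separable transform: apply the 1-D DCT along rows, then along columns.
    rows = [[sum(basis[k][j] * row[j] for j in range(n)) for k in range(n)] for row in block]
    return [[sum(basis[k][i] * rows[i][l] for i in range(n)) for l in range(n)] for k in range(n)]


def pca_fit(vectors, n_components, iters=100, seed=0):
    """Return (mean, components) using power iteration with deflation (an assumed method)."""
    m, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / m for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            # Apply the unscaled covariance matrix as X^T (X w) without forming it.
            proj = [sum(c[j] * w[j] for j in range(d)) for c in centered]
            w = [sum(proj[i] * centered[i][j] for i in range(m)) for j in range(d)]
            # Deflation: remove directions already captured by earlier components.
            for comp in components:
                dot = sum(a * b for a, b in zip(w, comp))
                w = [a - dot * b for a, b in zip(w, comp)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm < 1e-12:
                raise ValueError("n_components exceeds the rank of the centered data")
            w = [a / norm for a in w]
        components.append(w)
    return mean, components


def normalize(features):
    """Z-score each feature dimension (assumed form of the normalization stage)."""
    m, k = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / m for j in range(k)]
    std = [math.sqrt(sum((f[j] - mean[j]) ** 2 for f in features) / m) or 1.0 for j in range(k)]
    return [[(f[j] - mean[j]) / std[j] for j in range(k)] for f in features]


def dct_pca_features(images, n_components):
    """Stage 1: DCT. Stage 2: PCA projection. Stage 3: normalization."""
    coeffs = [[c for row in dct2(img) for c in row] for img in images]
    mean, comps = pca_fit(coeffs, n_components)
    projected = [
        [sum((v[j] - mean[j]) * comp[j] for j in range(len(mean))) for comp in comps]
        for v in coeffs
    ]
    return normalize(projected)


# Synthetic stand-in for 20 mouth-region blocks of 8x8 pixels.
rng = random.Random(42)
images = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(20)]
features = dct_pca_features(images, n_components=3)
print(len(features), len(features[0]))  # prints: 20 3
```

In this sketch PCA chooses which combinations of DCT coefficients to keep from the data itself, which is the contrast the abstract draws with manually selected coefficients. A Gabor-PCA variant would swap `dct2` for a Gabor filter bank and leave the later two stages unchanged.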