Fast Mouth Detection Approach Applied in Audio-Visual Speech Recognition

Abstract: Visual speech information from the speaker's mouth plays a significant role in improving speech recognition rates in noisy environments. This paper introduces one of the main components of an audio-visual speech recognition (AVSR) system, the visual front-end design, and describes a machine learning method for detecting the mouth in a face that processes images rapidly while achieving a high detection rate. The method applies rotated Haar-like features computed on the integral image, uses single-value classifiers as the base feature classifiers within the AdaBoost learning algorithm, combines the resulting strong classifiers in a cascade, and finally partitions the detected region to locate the mouth. Applied in an AVSR system, the method achieves accurate mouth detection that essentially meets real-time requirements.
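The integral image is what makes Haar-like features cheap to evaluate: once it is built, the sum over any upright rectangle costs four lookups. The following is a minimal Python sketch of that idea, not the authors' implementation. The function names (`integral_image`, `rect_sum`, `two_rect_feature`) are hypothetical, and the sketch covers only an upright two-rectangle feature. The rotated features mentioned in the abstract need a second, rotated summed-area table that is not shown here.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """Build a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current image row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Upright two-rectangle feature: top half minus bottom half (h assumed even)."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


if __name__ == "__main__":
    # Hypothetical 4x3 grayscale patch used only for illustration.
    patch = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12],
    ]
    ii = integral_image(patch)
    print(rect_sum(ii, 1, 1, 2, 2))
    print(two_rect_feature(ii, 0, 0, 3, 4))
```

In a boosted detector, each weak classifier would typically compare one such feature value against a learned threshold. That step is omitted here because the abstract does not give its details.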

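Authors: Liu Jiatao, Chen Yimin

Affiliation: School of Computer Science and Engineering, Shanghai University

Source: Computer Technology and Development (《计算机技术与发展》), 2008, No. 10, pp. 16-19 (4 pages)

Funding: Supported by the Shanghai Science and Technology Fund (7A07094)

Keywords: modality; audio-visual speech recognition; Haar-like feature; region of interest; integral image; regionalization

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]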
