
Lip Feature Recognition Using the Inter-Frame Correlation of Chinese Speech Phonemes (cited by: 1)

Lip Contour Recognition Based on Context Information of Chinese Phoneme
Abstract: To further improve lip feature recognition for Chinese speech, this work analyzes how phonemes change between initials (consonants) and finals (vowels) during actual Chinese pronunciation, as well as the mouth-shape changes caused by pronunciation habits such as connected speech. Exploiting the correlation between the phoneme frames that correspond to the lip features, a second-order hidden Markov model (HMM) is used to learn and recognize sequences of lip feature parameters, and the resulting effect on Chinese lip recognition is analyzed. Experiments on isolated Chinese character pronunciations show that, under speaker-dependent recognition conditions, with the optimal weighting factor for feature combination (m : n = 1.5 : 1), and using the same set of fused feature vectors, taking the inter-frame phoneme correlation into account raises the recognition rate by 1.2%. This indicates that the inter-frame correlation of phonemes within Chinese syllables corresponds to the pattern of change in lip features, and that exploiting it helps improve lip recognition.
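As an illustration of the second-order HMM idea mentioned in the abstract, the following is a minimal sketch of a scaled forward algorithm in which each state transition depends on the two preceding states. It is not the authors' implementation. It assumes discrete (for example, vector-quantized) observation symbols, and the names `second_order_forward_loglik`, `random_model`, `pi`, `A1`, `A2`, and `B` are hypothetical choices made for this example only.

```python
import numpy as np


def second_order_forward_loglik(obs, pi, A1, A2, B):
    """Log-likelihood of a discrete observation sequence under a second-order HMM.

    Assumed (hypothetical) parameterization:
      pi[i]       = P(q_0 = i)
      A1[i, j]    = P(q_1 = j | q_0 = i), used only for the first transition
      A2[i, j, k] = P(q_t = k | q_{t-2} = i, q_{t-1} = j) for t >= 2
      B[k, o]     = P(observation o | state k)
    """
    obs = np.asarray(obs)
    alpha0 = pi * B[:, obs[0]]
    if len(obs) == 1:
        return float(np.log(alpha0.sum()))

    # Pairwise forward variable alpha[i, j] over (previous state, current state).
    alpha = alpha0[:, None] * A1 * B[:, obs[1]][None, :]
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale  # rescale at every step to avoid numerical underflow

    for o in obs[2:]:
        # alpha_new[j, k] = sum_i alpha[i, j] * A2[i, j, k] * B[k, o]
        alpha = np.einsum("ij,ijk->jk", alpha, A2) * B[:, o][None, :]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return float(loglik)


def random_model(rng, n_states, n_symbols):
    """Build random, properly normalized toy parameters (illustration only)."""
    def normalize(x):
        return x / x.sum(axis=-1, keepdims=True)

    pi = normalize(rng.random(n_states))
    A1 = normalize(rng.random((n_states, n_states)))
    A2 = normalize(rng.random((n_states, n_states, n_states)))
    B = normalize(rng.random((n_states, n_symbols)))
    return pi, A1, A2, B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # One toy model per class; a real system would train these from data.
    models = {name: random_model(rng, 3, 4) for name in ("class_a", "class_b")}
    sequence = [0, 3, 1, 2, 2, 1]
    scores = {name: second_order_forward_loglik(sequence, *params)
              for name, params in models.items()}
    # Recognition picks the model with the highest log-likelihood.
    print(max(scores, key=scores.get), scores)
```

In a recognizer of this kind, one model would typically be trained per class and a test sequence assigned to the class whose model scores highest. If `A2[i, j, k]` is set so that it does not depend on `i`, the computation reduces to the ordinary first-order forward algorithm, which makes the extra context used by the second-order model explicit.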
Source: Journal of Hebei University of Technology (《河北工业大学学报》), indexed in CAS and the Peking University Core Journals list (北大核心), 2010, No. 3, pp. 37-41 (5 pages)
Funding: National Natural Science Foundation of China (60674111); Tianjin University "985 Project" funding
Keywords: lip contour recognition; inter-frame phoneme correlation (context information); weighted combined feature vector; second-order hidden Markov model


