The Application of Fuzzy Language Model in Lip-reading (cited by: 1)

Abstract: To address the data sparsity and the overly sharp maximum likelihood estimates that affect traditional statistical language models, this paper proposes an n-gram model based on fuzzy representation and applies it to a lip-reading system. Combined with the Hidden Markov Model (HMM), it yields a new lip-movement recognition model, HFM (HMM and Fuzzy Language Model). A small corpus was built with the Corpus Online system developed by the Computational Linguistics Research Laboratory of the Institute of Applied Linguistics, Ministry of Education, and was used for sentence recognition experiments. The results show that HFM, which needs no smoothing, improves the syllable recognition rate by up to 6.5% and the sentence recognition rate by up to 22.7%. In addition, when the text stream is parsed with the language model rather than by blind text matching, the parsing accuracy for the visual stream alone reaches 68.7%.
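The data sparsity problem named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' fuzzy n-gram model, whose details the abstract does not give; it only shows, on a hypothetical toy corpus of pinyin syllables, how a plain maximum likelihood bigram estimate assigns zero probability to any syllable pair that never occurred in training. The function name `bigram_mle` and the corpus are assumptions made for illustration.

```python
from collections import Counter


def bigram_mle(sentences):
    """Return a function p(w2 | w1) estimated by maximum likelihood."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Every token except the last one serves as a history for the next.
        history_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))

    def prob(w1, w2):
        if history_counts[w1] == 0:
            return 0.0  # history never seen, so no estimate is available
        return bigram_counts[(w1, w2)] / history_counts[w1]

    return prob


# Hypothetical toy corpus (space-separated pinyin syllables).
corpus = ["ni hao", "ni hao ma", "ni men hao"]
p = bigram_mle(corpus)

print(p("ni", "hao"))  # pair seen in training: non-zero estimate
print(p("hao", "ni"))  # pair never seen: the estimate collapses to 0.0
```

Because a sentence probability is a product of such terms, a single unseen pair drives the whole sentence score to zero. Conventional n-gram systems work around this with smoothing, whereas the abstract reports that HFM does not need it.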

Authors: Rong Chuanzhen, Yue Zhenjun, Wang Yuan, Yang Yu

Institution: College of Communications Engineering, PLA University of Science and Technology

Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 10, pp. 1301-1306 (6 pages)

Funding: Supported by the Natural Science Foundation of Jiangsu Province (BK2012511)

Keywords: lip-reading; fuzzy language model; hidden Markov model; corpus

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


