
Language Recognition Based on Pattern Discovery
Abstract: Speech patterns combine differently in different languages, and this paper proposes a language recognition method based on speech pattern discovery that exploits this property. First, a Gaussian mixture model (GMM) is trained on unlabeled speech data to discover the speech patterns; the posterior probability of each speech frame under each speech pattern is then computed and used to determine the pattern boundaries. Next, n-gram statistics capture the co-occurrence relations among the speech patterns in each utterance, with the joint probability of the patterns describing those relations. Finally, a support vector machine (SVM) serves as the classifier for language recognition. The method is tested on the NIST2003 and NIST2007 corpora for three languages: English, Japanese, and Mandarin. The results show equal error rates of 0.14%, 0.14%, and 0.49% for speech durations of 3 s, 10 s, and 30 s, respectively.
Authors: GUAN Nana, ZHANG Lianhai (Information Engineering University, Zhengzhou 450001, China)
Affiliation: Information Engineering University
Source: Journal of Information Engineering University, 2018, No. 1, pp. 52-56 (5 pages)
Funding: Project of the CETC 54th Research Institute
Keywords: language recognition; unsupervised; pattern discovery; n-gram
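
The middle stages of the pipeline described in the abstract (frame posteriors, pattern boundaries, n-gram statistics) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how boundaries are derived from the posteriors, so the sketch assumes each frame is assigned to its highest-posterior pattern and that a boundary falls wherever that pattern changes. It also assumes n = 2 (bigrams) and uses hand-written toy posteriors in place of the output of a trained GMM. The function names `segment_patterns` and `bigram_features` are hypothetical.

```python
from collections import Counter
from itertools import product


def segment_patterns(posteriors):
    """Collapse per-frame posteriors into a sequence of pattern segments.

    Assumption (not from the paper): a frame belongs to its
    highest-posterior pattern, and a boundary falls wherever that
    pattern changes between adjacent frames.
    """
    labels = [max(range(len(frame)), key=frame.__getitem__) for frame in posteriors]
    segments = []
    for label in labels:
        # Start a new segment only when the winning pattern changes.
        if not segments or segments[-1] != label:
            segments.append(label)
    return segments


def bigram_features(segments, num_patterns):
    """Return joint bigram probabilities as a fixed-length feature vector."""
    counts = Counter(zip(segments, segments[1:]))
    total = sum(counts.values())
    # One entry per ordered pattern pair, so every utterance yields a
    # vector of the same length regardless of its duration.
    return [
        counts[pair] / total if total else 0.0
        for pair in product(range(num_patterns), repeat=2)
    ]


# Toy posteriors for 6 frames over 3 hypothetical speech patterns.
frames = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
    [0.1, 0.2, 0.7],
    [0.9, 0.05, 0.05],
]
segments = segment_patterns(frames)
features = bigram_features(segments, num_patterns=3)
```

In a full system, a feature vector of this kind would be computed per utterance and passed to an SVM classifier, with the posteriors coming from a GMM trained on unlabeled speech; both of those stages are omitted here.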