Abstract
Because speech patterns combine differently in different languages, this paper proposes a language recognition method based on speech pattern discovery. First, a Gaussian mixture model (GMM) is trained on unlabeled speech data to discover the speech patterns; the posterior probability of each frame under each speech pattern is then computed and used to determine the speech pattern boundaries. Second, an n-gram method is used to collect statistics on how speech patterns co-occur within each utterance, and these co-occurrence relations are described by the joint probability of the speech patterns. Finally, a support vector machine (SVM) is used as the classifier to perform language recognition. The method is tested on the NIST2003 and NIST2007 corpora for three languages: English, Japanese and Mandarin. The reported equal error rates are 0.14%, 0.14% and 0.49% for speech durations of 3 s, 10 s and 30 s, respectively.
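The segmentation and n-gram steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: GMM training and the SVM classifier are left out, and the function names `segment_patterns` and `bigram_features` are hypothetical. Two assumptions are made for illustration only: a frame is assigned to the pattern with the highest posterior, with a boundary wherever that assignment changes, and the joint probability of two adjacent patterns is approximated by the product of their mean segment posteriors.

```python
from collections import defaultdict
from itertools import groupby


def segment_patterns(posteriors):
    """Collapse runs of frames that share a most likely pattern into segments.

    posteriors[t][k] is P(pattern k | frame t).
    Returns a list of (pattern_index, mean_posterior_over_segment).
    """
    # Assumption: each frame takes the argmax pattern, and a boundary
    # falls wherever that label changes between consecutive frames.
    labelled = [(max(range(len(p)), key=p.__getitem__), p) for p in posteriors]
    segments = []
    for label, run in groupby(labelled, key=lambda item: item[0]):
        probs = [p[label] for _, p in run]
        segments.append((label, sum(probs) / len(probs)))
    return segments


def bigram_features(segments, num_patterns):
    """Return a flattened num_patterns x num_patterns vector of bigram weights."""
    weights = defaultdict(float)
    for (a, pa), (b, pb) in zip(segments, segments[1:]):
        # Assumption: joint probability of adjacent patterns is approximated
        # by the product of their mean segment posteriors.
        weights[(a, b)] += pa * pb
    total = sum(weights.values())
    # Normalise so the vector sums to 1 (all zeros if there are no bigrams).
    return [
        weights[(i, j)] / total if total else 0.0
        for i in range(num_patterns)
        for j in range(num_patterns)
    ]
```

In a full pipeline, one such vector per utterance would be the input to the classifier; higher-order n-grams would extend the same pairing over longer windows of segments.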
Authors
GUAN Nana (关娜娜)
ZHANG Lianhai (张连海)
Information Engineering University, Zhengzhou 450001, China
Source
Journal of Information Engineering University (《信息工程大学学报》)
2018, No. 1, pp. 52-56 (5 pages)
Funding
Project of the 54th Research Institute of China Electronics Technology Group Corporation (CETC 54)
Keywords
language recognition
unsupervised
pattern discovery
n-gram