Gaussian mixture model compensation method using shift factor for speaker recognition
(基于高斯混合模型移动因子补偿的说话人识别方法) Cited by: 2
Abstract: The performance of GMM-based text-independent speaker recognition systems degrades rapidly as the amount of training data for the target speaker decreases. A model compensation method is proposed to address this problem. Since there is a shift between each adapted target speaker model and the UBM (universal background model), a low-dimensional affine space, called the shift space, is defined, and the shift of each model trained with sufficient data is represented by a shift factor in this space. When the training data of the target speaker is insufficient, the shift factor is first learned from the GMM mixtures that are insensitive to the amount of training data, and is then used to compensate the parameters of the remaining mixtures. On the same training and evaluation sets, the proposed method achieves a relative reduction of about 7% in EER (equal error rate) compared with the baseline system.
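To make the compensation idea in the abstract concrete, the sketch below shows one way such a scheme could be organized. It is a minimal illustration under stated assumptions, not the authors' implementation: the shift-space matrix V, the reliability mask, and all function names are hypothetical; V is assumed to be trained offline from speakers with sufficient data, and the mask is assumed to flag mixtures whose adaptation is insensitive to the amount of training data.

```python
import numpy as np

# Hypothetical sketch of shift-factor compensation for a MAP-adapted GMM.
# Assumptions: the UBM has C mixtures with D-dimensional means; the stacked
# mean shift (adapted means minus UBM means) is modelled as V @ y, where
# V (C*D x R) spans a low-dimensional "shift space" trained offline and
# y is the R-dimensional shift factor.

def estimate_shift_factor(ubm_means, adapted_means, V, reliable):
    """Least-squares estimate of the shift factor y, using only the mixtures
    marked reliable (a boolean mask of length C)."""
    C, D = ubm_means.shape
    shift = (adapted_means - ubm_means).reshape(C * D)   # observed mean shifts
    rows = np.repeat(reliable, D)                        # rows of reliable mixtures
    y, *_ = np.linalg.lstsq(V[rows], shift[rows], rcond=None)
    return y

def compensate_means(ubm_means, adapted_means, V, reliable, y):
    """Keep the means of reliable mixtures; replace the others by the UBM
    means plus the shift predicted from the shift factor."""
    C, D = ubm_means.shape
    predicted = ubm_means + (V @ y).reshape(C, D)
    return np.where(reliable[:, None], adapted_means, predicted)
```

In this sketch the shift factor is estimated by least squares from the well-adapted mixtures only, and the resulting low-dimensional factor predicts the mean shifts of the mixtures that received too little training data.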
Source: Acta Acustica (《声学学报》), indexed in EI and CSCD, Peking University Core Journal, 2011, No. 6, pp. 658-664 (7 pages)
Funding: Supported by the National Basic Research Program of China (973 Program, 2007CB311100) and a key project of the National High-Tech R&D Program of China (863 Program, 2006AA010103)