Research on Language Identification Based on the GMM-UBM Model (cited by: 10)

Automatic language identification based on GMM-UBM
Abstract: Compared with speaker recognition and continuous speech recognition, automatic language identification is a relatively new and fairly difficult research problem. This paper presents a language identification system based on the GMM-UBM model, evaluates the algorithm on the OGI-TS telephone speech corpus, and reports the experimental results. The results show that GMM-UBM is another effective method for language identification.
Source: Journal of Signal Processing (《信号处理》), CSCD, 2003, No. 1, pp. 85-88 (4 pages)
Keywords: speech recognition; language identification; GMM-UBM model; computer; Gaussian mixture model; universal background model; Bayesian adaptation
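
The record above names the technique (a Gaussian mixture model adapted from a universal background model by Bayesian adaptation) but gives no implementation details. The following is a minimal sketch of the general GMM-UBM idea, not the paper's system: it assumes diagonal covariances, mean-only MAP adaptation, a relevance factor of 16, and synthetic 2-D "features". All function names, the toy UBM, and the two languages "A" and "B" are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Per-frame, per-component log N(x | mean, diag(var)); shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
    return -0.5 * (log_norm + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal GMM."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    m = log_p.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return float(np.mean(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of the UBM toward the data X.

    The relevance factor is an assumed setting, not taken from the paper.
    """
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    log_p -= log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)            # responsibilities (T, M)
    n = post.sum(axis=0)                                # soft counts per component
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]   # data mean per component
    alpha = (n / (n + relevance))[:, None]              # data-dependent mixing
    return alpha * ex + (1 - alpha) * means

def identify(X, weights, ubm_means, variances, language_means):
    """Pick the language whose adapted model gives the highest LLR vs. the UBM."""
    ubm_ll = gmm_avg_loglik(X, weights, ubm_means, variances)
    scores = {lang: gmm_avg_loglik(X, weights, m, variances) - ubm_ll
              for lang, m in language_means.items()}
    return max(scores, key=scores.get), scores

def sample_frames(rng, means, n_frames):
    """Draw synthetic frames from an equal-weight, unit-variance mixture."""
    comp = rng.integers(0, len(means), size=n_frames)
    return means[comp] + rng.standard_normal((n_frames, means.shape[1]))

# Toy setup: a fixed 2-component UBM and two hypothetical languages whose
# component means are shifted in opposite directions along the second axis.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
ubm_means = np.array([[-2.0, 0.0], [2.0, 0.0]])
variances = np.ones((2, 2))
true_means = {"A": ubm_means + [0.0, 1.0], "B": ubm_means + [0.0, -1.0]}

# One adapted model per language, all sharing the UBM weights and variances.
language_means = {
    lang: map_adapt_means(sample_frames(rng, m, 2000), weights, ubm_means, variances)
    for lang, m in true_means.items()
}

test_utterance = sample_frames(rng, true_means["A"], 500)
best, scores = identify(test_utterance, weights, ubm_means, variances, language_means)
print(best, scores)
```

In a real system the UBM would itself be trained (for example with EM) on pooled multilingual speech, and the frames would be acoustic feature vectors rather than synthetic points; both are outside the scope of this sketch.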