
Language Identification Using Speaker Clustering and Gaussian Mixed Model
Abstract: This paper proposes a new approach to language identification. A language identification system is normally expected to be speaker-independent, yet the individual characteristics of speakers strongly affect its performance. We adopt a coarse-classification, fine-recognition strategy: speaker clustering performs the coarse classification by grouping similar speakers, a model is then built for each group of similar speakers, and recognition is carried out with these models. Preliminary experiments indicate that the method is effective for speaker-independent language identification.
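The coarse-classification, fine-recognition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The following are assumptions made here for brevity: each speaker group is modeled by a single diagonal Gaussian rather than a full Gaussian mixture model, speakers are grouped by plain k-means on their mean feature vectors, and the two-dimensional "features" are hand-made toy data. All function names (`cluster_speakers`, `train`, `identify`, `toy_frames`) are hypothetical.

```python
import math

def mean_vec(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster_speakers(speaker_means, k, iters=20):
    """Coarse step (assumed: plain k-means). Returns one cluster index per speaker."""
    centers = [list(m) for m in speaker_means[:k]]  # deterministic init: first k speakers
    labels = [0] * len(speaker_means)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(m, centers[c]))
                  for m in speaker_means]
        for c in range(k):
            members = [m for m, l in zip(speaker_means, labels) if l == c]
            if members:
                centers[c] = mean_vec(members)
    return labels

def fit_gaussian(frames, floor=1e-3):
    """Single diagonal Gaussian, a simplified stand-in for a GMM."""
    mu = mean_vec(frames)
    var = [max(sum((f[d] - mu[d]) ** 2 for f in frames) / len(frames), floor)
           for d in range(len(mu))]
    return mu, var

def avg_loglik(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mu, var = model
    total = 0.0
    for f in frames:
        total += sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
                     for x, m, v in zip(f, mu, var))
    return total / len(frames)

def train(data, k):
    """data: {language: {speaker: [frame, ...]}} -> {(language, cluster): model}."""
    models = {}
    for lang, speakers in data.items():
        names = sorted(speakers)
        labels = cluster_speakers([mean_vec(speakers[s]) for s in names], k)
        for c in set(labels):
            frames = [f for s, l in zip(names, labels) if l == c for f in speakers[s]]
            models[(lang, c)] = fit_gaussian(frames)
    return models

def identify(frames, models):
    """Fine step: pick the language of the best-scoring (language, cluster) model."""
    best = max(models, key=lambda key: avg_loglik(frames, models[key]))
    return best[0]

def toy_frames(cx, cy):
    """Five toy 2-D frames around (cx, cy): x carries the language cue, y the speaker trait."""
    offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
    return [[cx + dx, cy + dy] for dx, dy in offsets]

data = {
    "A": {"a1": toy_frames(0.0, -5.0), "a2": toy_frames(0.0, 5.0),
          "a3": toy_frames(0.2, -5.2), "a4": toy_frames(-0.2, 5.2)},
    "B": {"b1": toy_frames(3.0, -5.0), "b2": toy_frames(3.0, 5.0),
          "b3": toy_frames(3.2, -4.8), "b4": toy_frames(2.8, 4.8)},
}
models = train(data, k=2)
print(identify(toy_frames(0.1, 4.9), models))
print(identify(toy_frames(2.9, -5.1), models))
```

In this toy setup the speaker trait (second dimension) varies far more than the language cue (first dimension), which is the situation the abstract describes: modeling each group of similar speakers separately keeps the speaker variation from swamping the language difference.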
Source: Journal of Signal Processing (《信号处理》), CSCD-indexed, 2004, No. 3, pp. 285-289 (5 pages)
Funding: National Natural Science Foundation of China, grant No. 60372038
Keywords: language identification system; Gaussian mixture model; speaker; clustering; speech signal processing