
Cochannel Speech Separation Based on Clustering

Abstract: Many computational auditory scene analysis (CASA) systems for separating cochannel (two-talker) speech rely on trained models, and therefore depend on the availability of effective training data and prior knowledge of the participating speakers. This paper proposes an unsupervised, clustering-based system for separating monaural two-talker mixtures. The system first analyzes the mixture signal with a multi-pitch tracking algorithm to produce simultaneous streams, then searches for the optimal assignment of these streams by maximizing the ratio of the traces of the between-cluster and within-cluster scatter matrices, and finally separates the two speakers' speech. The system requires no pretrained speaker models, noticeably improves the separation of two-talker mixtures, and offers a new approach to cochannel speech separation.
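The clustering criterion described in the abstract can be illustrated with a small sketch. This is not the authors' code: the feature representation of each simultaneous stream, the helper names (`scatter_trace_ratio`, `best_two_speaker_assignment`), and the exhaustive search over binary assignments are all assumptions made for illustration; the multi-pitch tracking and stream-formation stages are assumed to have already produced an (n_streams, n_dims) feature matrix.

```python
# Minimal sketch (assumptions noted above): assign simultaneous streams to two
# speakers by maximizing trace(S_B) / trace(S_W), the ratio of between-cluster
# to within-cluster scatter, over all binary labelings of the streams.
import itertools
import numpy as np

def scatter_trace_ratio(features, labels):
    """Return trace(S_B) / trace(S_W) for a binary labeling of the streams."""
    mu = features.mean(axis=0)
    s_b = 0.0  # trace of between-cluster scatter
    s_w = 0.0  # trace of within-cluster scatter
    for k in (0, 1):
        cluster = features[labels == k]
        if len(cluster) == 0:
            return -np.inf  # reject labelings that leave one speaker empty
        mu_k = cluster.mean(axis=0)
        s_b += len(cluster) * np.sum((mu_k - mu) ** 2)
        s_w += np.sum((cluster - mu_k) ** 2)
    return s_b / (s_w + 1e-12)

def best_two_speaker_assignment(features):
    """Exhaustively search binary assignments of streams to two speakers."""
    n = len(features)
    best_labels, best_score = None, -np.inf
    for bits in itertools.product((0, 1), repeat=n):
        labels = np.array(bits)
        score = scatter_trace_ratio(features, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels, best_score
```

For a handful of streams the exhaustive search is cheap; larger problems would need a smarter search strategy, which this sketch does not attempt.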
Authors: 吴春, 梁正友
Source: Computer and Modernization (《计算机与现代化》), 2014, No. 4, pp. 86-88 (3 pages)
Keywords: computational auditory scene analysis (CASA); cochannel speech separation; clustering

References (14)

  • 1. Bregman A S. Auditory Scene Analysis: The Perceptual Organization of Sound[M]. MIT Press, 1994.
  • 2. 吴镇扬, 张子瑜, 李想, 赵力. 听觉场景分析的研究进展 (Advances in auditory scene analysis)[J]. 电路与系统学报 (Journal of Circuits and Systems), 2001, 6(2): 68-73.
  • 3. Shao Y, Wang D L. Model-based sequential organization in cochannel speech[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(1): 289-298.
  • 4. Barker J, Coy A, Ma N, et al. Recent advances in speech fragment decoding techniques[C]//Proceedings of Interspeech, 2006: 85-88.
  • 5. Hershey J R, Rennie S J, Olsen P A, et al. Super-human multi-talker speech recognition: A graphical modeling approach[J]. Computer Speech & Language, 2010, 24(1): 45-66.
  • 6. Weiss R J, Ellis D P W. Speech separation using speaker-adapted eigenvoice speech models[J]. Computer Speech & Language, 2010, 24(1): 16-29.
  • 7. Wang D L, Brown G J. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications[M]. Wiley-IEEE Press, 2006.
  • 8. Shao Y. Sequential Organization in Computational Auditory Scene Analysis[D]. The Ohio State University, 2007.
  • 9. Jin Z, Wang D L. Reverberant speech segregation based on multipitch tracking and classification[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(8): 2328-2337.
  • 10. Narayanan A, Wang D L. Robust speech recognition from binary masks[J]. The Journal of the Acoustical Society of America, 2010, 128(5): EL217-EL222.

