
Research on speech separation based on computational auditory scene analysis (cited by: 6)
Abstract: The human auditory system can pick out the speech it is interested in even in strongly noisy environments. In computational auditory scene analysis (CASA), the key and most difficult task is to find suitable auditory cues for separating the target speech signal from the noise signal. For the problem of monaural (single-channel) voiced speech separation, this paper proposes a voiced speech separation algorithm that uses pitch as the cue. In simulation experiments under six noise conditions, including white noise and cocktail party noise, the output SNR of the proposed algorithm is on average 7.47 dB higher than that of traditional spectral subtraction. The algorithm also effectively suppresses the interfering noise and improves separation performance.
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 12, pp. 3822-3824 (3 pages)
Funding: Natural Science Foundation of Shanxi Province (2013011016-1); Doctoral Program Foundation of the Ministry of Education of China (2011081047)
Keywords: speech separation; computational auditory scene analysis; pitch; segmentation; auditory stream
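
The abstract reports its result as an average gain in output SNR over spectral subtraction. This record does not give the paper's exact SNR definition, so the sketch below assumes the common one: the energy of the clean target divided by the energy of the residual error, in dB. The function names `snr_db` and `snr_improvement_db` are hypothetical and used only for illustration.

```python
import math


def snr_db(target, estimate):
    """Output SNR in dB, assuming the usual definition:
    10 * log10(sum(target^2) / sum((target - estimate)^2))."""
    if len(target) != len(estimate):
        raise ValueError("target and estimate must have the same length")
    signal_energy = sum(t * t for t in target)
    error_energy = sum((t - e) ** 2 for t, e in zip(target, estimate))
    if signal_energy == 0.0:
        raise ValueError("target has zero energy; SNR is undefined")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def snr_improvement_db(target, proposed, baseline):
    """SNR gain (dB) of one separated output over a baseline output,
    both measured against the same clean target signal."""
    return snr_db(target, proposed) - snr_db(target, baseline)


if __name__ == "__main__":
    # Toy signals only; these are not data from the paper.
    clean = [1.0, -1.0, 1.0, -1.0]
    proposed_out = [1.1, -0.9, 1.1, -0.9]  # small residual error
    baseline_out = [2.0, 0.0, 2.0, 0.0]    # large residual error
    print(round(snr_improvement_db(clean, proposed_out, baseline_out), 2))
```

Under this assumed definition, an average figure such as the one in the abstract would be obtained by computing the gain for each noise condition and averaging across conditions.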









