
CASA-based Speaker Identification in Specific Noisy Environment (基于CASA的噪声环境下的话者辨认)
Abstract: Conventional speaker recognition systems perform poorly under noisy conditions. A binary mask obtained through computational auditory scene analysis (CASA) can be used to reconstruct the noise-dominated parts of the speech, so that speaker-related information corrupted by the noise is recovered. The quality of the reconstruction, however, depends on the proportion of reliable components in each frame. This paper therefore sets a threshold based on the extracted binary mask and uses it to select frames of the test features, dividing them into three classes: frames to reconstruct, frames to retain, and frames to discard. The reconstructed and retained frames are then passed to subsequent processing and used for identification. Experimental results show that, compared with the original reconstruction system, the algorithm improves the recognition rate to some extent.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2016, Issue 5, pp. 1107-1111 (5 pages)
Keywords: computational auditory scene analysis (CASA); Gammatone frequency cepstral coefficient (GFCC); ideal binary mask (IBM); threshold
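
The frame-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the three classes map onto the mask, so the sketch assumes that frames with a high share of reliable mask units are retained as they are, frames with an intermediate share are reconstructed, and frames with too few reliable units are discarded. The two cut points `low` and `high`, their default values, and all function names are hypothetical.

```python
from typing import List, Sequence

RETAIN, RECONSTRUCT, DISCARD = "retain", "reconstruct", "discard"


def reliable_ratio(frame_mask: Sequence[int]) -> float:
    """Fraction of units in one frame that the binary mask marks reliable (1)."""
    if not frame_mask:
        raise ValueError("frame mask must not be empty")
    return sum(frame_mask) / len(frame_mask)


def classify_frames(
    mask: Sequence[Sequence[int]],
    low: float = 0.3,   # hypothetical cut point, not taken from the paper
    high: float = 0.9,  # hypothetical cut point, not taken from the paper
) -> List[str]:
    """Label each frame as retain, reconstruct, or discard.

    Assumed rule (not stated in the abstract):
      ratio >= high        -> retain (mostly reliable already)
      low <= ratio < high  -> reconstruct (enough reliable units to rebuild from)
      ratio < low          -> discard (too little reliable evidence)
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    labels = []
    for frame_mask in mask:
        ratio = reliable_ratio(frame_mask)
        if ratio >= high:
            labels.append(RETAIN)
        elif ratio >= low:
            labels.append(RECONSTRUCT)
        else:
            labels.append(DISCARD)
    return labels


def frames_for_recognition(labels: Sequence[str]) -> List[int]:
    """Indices of frames passed on to recognition (retained or reconstructed)."""
    return [i for i, label in enumerate(labels) if label != DISCARD]


if __name__ == "__main__":
    # Toy mask: 4 frames x 4 frequency channels, 1 = reliable, 0 = noise-dominated.
    toy_mask = [
        [1, 1, 1, 1],
        [1, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    toy_labels = classify_frames(toy_mask)
    print(toy_labels)
    print(frames_for_recognition(toy_labels))
```

In a full system the frames labeled `reconstruct` would go through a missing-feature reconstruction step before scoring. That step is outside the scope of this sketch.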