
A Reverberation Compensation Method for Speaker Recognition in Rooms
Abstract: To address the significant drop in the recognition rate of speaker recognition systems in rooms caused by a mismatch between the training and recognition environments, a reverberation compensation method based on differentiated feature extraction is proposed. Unlike the recognition phase, which uses conventional MFCC features, the training phase applies Schroeder backward (inverse) integration to obtain the room's sound energy decay curve in the mel frequency bands and uses that curve to compensate the MFCC features of clean signals, so that they simulate the features of sound signals in an actual reverberant room. In addition, relative spectral (RASTA) filtering and cepstral mean normalization (CMN) are applied jointly to the MFCC features to further suppress the effect of the room channel on the speech signal. Recognition results on data measured in rooms with different degrees of reverberation show that the method significantly improves the recognition rate and suppresses the influence of reverberation well.
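Two of the building blocks named in the abstract, Schroeder backward integration and CMN, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the per-mel-band procedure or the way the decay curve is applied to the MFCC features, so the example below computes only a broadband energy decay curve from a synthetic room impulse response and applies plain CMN to a random stand-in for an MFCC matrix. The function names, the sampling rate, the 0.5 s reverberation time, and the synthetic data are all assumptions made for illustration; RASTA filtering is not shown.

```python
import numpy as np


def schroeder_edc_db(rir):
    """Energy decay curve in dB via Schroeder backward integration of an impulse response."""
    energy = np.asarray(rir, dtype=float) ** 2
    # Backward integration: energy remaining from each sample to the end of the response.
    remaining = np.cumsum(energy[::-1])[::-1]
    # Normalize so the curve starts at 0 dB.
    return 10.0 * np.log10(remaining / remaining[0])


def cepstral_mean_normalization(features):
    """Subtract the per-coefficient mean over time from a (frames, coefficients) matrix."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


# Assumed synthetic impulse response: white noise whose energy decays by 60 dB in rt60 seconds.
fs = 16000                      # sampling rate in Hz (assumption)
rt60 = 0.5                      # reverberation time in seconds (assumption)
t = np.arange(int(fs * rt60)) / fs
rng = np.random.default_rng(0)
rir = rng.standard_normal(t.size) * np.exp(-np.log(1000.0) * t / rt60)

edc = schroeder_edc_db(rir)

# Random stand-in for an MFCC matrix (200 frames, 13 coefficients) with a constant offset,
# which plays the role of a stationary channel effect in the cepstral domain.
mfcc = rng.standard_normal((200, 13)) + 5.0
mfcc_cmn = cepstral_mean_normalization(mfcc)
```

CMN removes any component that is constant over the utterance in the cepstral domain, which is why it helps against short, stationary channel effects; long reverberation tails spread across frames, which is presumably why the abstract combines it with RASTA filtering and training-side compensation.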
Authors: 曾向阳 (Zeng Xiangyang), 王强 (Wang Qiang)
Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2015, No. 3, pp. 420-425 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
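Funding: Supported by the National Natural Science Foundation of China (11374241) and the Natural Science Foundation of Shaanxi Province (2012JM1010).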
Keywords: covariance matrix; energy dissipation; experiments; feature extraction; identification (control systems); integration; reverberation; schematic diagrams; stability; testing; cepstral mean normalization (CMN); identification of MFCC features with reverberation compensation; MFCC feature extraction; relative spectral (RASTA) filtering; reverberation compensation method; REMOS (reverberation models); RIR (room impulse response); Schroeder inverse integration; speaker recognition

