A Reverberation Compensation Method for Speaker Recognition in Rooms (用于室内环境说话人识别的混响补偿方法)


Abstract: Speaker recognition systems used indoors lose considerable accuracy when the training and recognition environments differ. To address this mismatch, a reverberation compensation method based on differentiated feature extraction is proposed. The recognition phase uses conventional MFCC features. The training phase, in contrast, applies Schroeder backward integration in the mel frequency bands to obtain the room's sound energy decay curve, then uses that curve to compensate the MFCC features of clean signals so that they approximate the features of signals recorded in a real reverberant room. In addition, relative spectral (RASTA) filtering and cepstral mean normalization (CMN) are applied jointly to the MFCC features to further suppress the effect of the room channel on the speech signal. Recognition results on data measured in rooms with different degrees of reverberation show that the method significantly improves the recognition rate and suppresses the influence of reverberation well.
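The two signal-processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it computes a full-band energy decay curve by Schroeder backward integration (the paper works per mel band), and it shows plain cepstral mean normalization. The function names, the synthetic impulse response, and the 0.5 s reverberation time are all assumptions made for illustration. How the decay curve is then used to compensate the MFCC features is not specified in the abstract and is not shown here.

```python
import numpy as np


def schroeder_edc_db(rir):
    """Energy decay curve in dB via Schroeder backward integration.

    rir: 1-D room impulse response (hypothetical input for this sketch).
    """
    energy = np.asarray(rir, dtype=float) ** 2
    # Backward integration: EDC[n] = sum of h[k]^2 for all k >= n.
    edc = np.cumsum(energy[::-1])[::-1]
    # Normalize so the curve starts at 0 dB; the floor avoids log(0).
    edc = edc / edc[0]
    return 10.0 * np.log10(np.maximum(edc, 1e-12))


def cepstral_mean_normalization(features):
    """Subtract the per-coefficient mean over time.

    features: array of shape (frames, coefficients), e.g. MFCCs.
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


# Synthetic impulse response (assumption): exponentially decaying noise
# whose amplitude envelope falls by 60 dB after rt60 seconds.
fs = 16000
rt60 = 0.5
t = np.arange(int(fs * rt60)) / fs
rng = np.random.default_rng(0)
rir = rng.standard_normal(t.size) * np.exp(-np.log(1000.0) * t / rt60)

edc_db = schroeder_edc_db(rir)

# Stand-in feature matrix (assumption): 200 frames of 13 coefficients
# with a constant offset that CMN should remove.
mfcc_like = rng.standard_normal((200, 13)) + 5.0
normalized = cepstral_mean_normalization(mfcc_like)
```

The decay curve starts at 0 dB and never increases, because each point is the energy remaining in the tail of the response. CMN removes any constant offset in the cepstral domain, which is where a fixed convolutional channel appears as an additive term.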

Authors: Zeng Xiangyang (曾向阳), Wang Qiang (王强)

Affiliation: School of Marine Science and Technology, Northwestern Polytechnical University (西北工业大学航海学院)

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2015, No. 3, pp. 420-425 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (11374241); Natural Science Foundation of Shaanxi Province (2012JM1010)

Keywords: covariance matrix; energy dissipation; experiments; feature extraction; identification (control systems); integration; reverberation; schematic diagrams; stability testing; cepstral mean normalization (CMN); identification of MFCC features with reverberation compensation; MFCC feature extraction; relative spectral (RASTA) filtering; reverberation compensation method; REMOS (reverberation model); RIR (room impulse response); Schroeder inverse integration; speaker recognition

Classification number: TP391.42 [Automation and Computer Technology: Computer Application Technology]



