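Abstract
For the acoustic scenario of close-talk systems, a dual-microphone speech enhancement algorithm based on auditory perception characteristics is proposed. To mimic the frequency decomposition performed by the human ear, a gammatone filterbank decomposes the sound signals captured by the two microphones into multiple frequency subbands. The decomposed time-domain signals are divided into frames to form time-frequency (T-F) units, and the energy of each T-F unit is computed. Using the ratio of T-F unit energies between the two channels as a cue, the signal-to-noise ratio (SNR) of each T-F unit is estimated; a mask that mimics auditory masking in the human ear is then generated and applied to the noisy speech signal to separate the target speech from the ambient noise. Experimental results show that the ratio of T-F unit energies between the two microphone signals gives a fairly accurate estimate of the T-F unit SNR, and that the algorithm improves the recognition accuracy of command words corrupted by babble noise, outperforming current single-channel and dual-channel speech enhancement algorithms.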
A dual-microphone speech enhancement algorithm was developed for close-talk applications. Gammatone filterbanks were used to decompose the sound captured by each microphone into multiple frequency channels. The decomposed signals were framed and analyzed to calculate the energy of each time-frequency (T-F) unit. The energy ratio between the two microphones was used to estimate the signal-to-noise ratio (SNR) of each T-F unit and to generate a binary mask. Finally, the binary mask was used to separate the target speech from the mixture. Tests show that this algorithm accurately estimates the SNR of the T-F units. The speech enhancement algorithm improves the recognition accuracy of command sentences corrupted by noise and is superior to single-channel and dual-channel speech enhancement algorithms.
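The processing chain summarized above (gammatone decomposition, framing into T-F units, inter-microphone energy ratio, binary mask) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the FIR approximation of the gammatone filter, the frame length and hop, and the 3 dB decision threshold are all assumptions chosen for the example, and the abstract does not state how the energy ratio is mapped to an SNR estimate. The sketch simply treats a T-F unit as speech-dominated when the primary (near-mouth) microphone carries noticeably more energy than the reference microphone.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore fit)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, n_taps=512, order=4, b=1.019):
    """Truncated gammatone impulse response, normalised to unit energy.

    n_taps, order and b are illustrative defaults, not values from the paper.
    """
    t = np.arange(n_taps) / fs
    g = (t ** (order - 1)
         * np.exp(-2.0 * np.pi * b * erb(fc) * t)
         * np.cos(2.0 * np.pi * fc * t))
    return g / np.sqrt(np.sum(g ** 2))

def tf_energy(x, fs, centre_freqs, frame_len, hop):
    """Energy of each T-F unit: rows are subbands, columns are frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    energy = np.empty((len(centre_freqs), n_frames))
    for c, fc in enumerate(centre_freqs):
        # Subband signal for this channel (FIR approximation of the filter).
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        for m in range(n_frames):
            frame = y[m * hop : m * hop + frame_len]
            energy[c, m] = np.sum(frame ** 2)
    return energy

def energy_ratio_mask(e_primary, e_reference, threshold_db=3.0, eps=1e-12):
    """Binary mask from the inter-microphone energy ratio of each T-F unit.

    threshold_db is a hypothetical decision threshold: units whose
    primary-to-reference energy ratio exceeds it are kept (mask = 1).
    """
    ratio_db = 10.0 * np.log10((e_primary + eps) / (e_reference + eps))
    return (ratio_db > threshold_db).astype(float)
```

Given two recordings `x1` and `x2` at sampling rate `fs`, one would compute `tf_energy` for each with the same centre frequencies and framing, then pass the two energy matrices to `energy_ratio_mask`. Resynthesizing enhanced speech from the masked subband signals is left out of this sketch.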
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2014, Issue 9, pp. 1179-1183 (5 pages)
Journal of Tsinghua University (Science and Technology)