
Interactive Dual-Branch Monaural Speech Enhancement Model Based on Critical Frequency Band
Abstract: Current mainstream dual-branch monaural speech enhancement methods attend only to full-band information and ignore subband information. To address this problem, an interactive dual-branch model based on the critical frequency bands of the human ear is proposed. On the complex spectrum branch, the signal is split into subbands using a partition that mimics the critical bands of the human ear, and subband information is extracted. On the amplitude compensation branch, the full band of the signal is processed directly, and full-band information is extracted. The complex spectrum branch is responsible for a preliminary recovery of the amplitude and phase of the clean speech. At the same time, dedicated modules pass the intermediate subband features learned on this branch to the amplitude compensation branch for compensation. The output of the amplitude compensation branch then further compensates the amplitude of the complex spectrum branch output, so as to recover the clean speech spectrum. Experimental results show that the proposed model outperforms other advanced monaural speech enhancement models in restoring speech quality and intelligibility.
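The two ideas in the abstract, a critical-band partition of the spectrum and an amplitude correction applied to a complex-spectrum estimate, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact band partition or the form of the compensation, so the sketch assumes Zwicker's classical Bark band edges, a 512-point STFT at 16 kHz, and a simple additive amplitude residual. The names `BARK_EDGES_HZ`, `critical_band_slices`, and `compensate_amplitude` are hypothetical.

```python
import numpy as np

# Assumption: Zwicker's classical critical-band (Bark) edges in Hz.
# The paper's actual partition is not stated in the abstract.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]


def critical_band_slices(n_fft=512, sample_rate=16000):
    """Return (start, stop) STFT-bin ranges, one per critical band."""
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft
    nyquist = sample_rate / 2
    # Keep the band edges below Nyquist and close the last band at Nyquist.
    edges_hz = [e for e in BARK_EDGES_HZ if e < nyquist] + [nyquist]
    edges = [int(round(e / bin_hz)) for e in edges_hz]
    edges[-1] = n_bins  # make the last band include the Nyquist bin
    # Drop empty bands, which can occur for very small FFT sizes.
    return [(a, b) for a, b in zip(edges[:-1], edges[1:]) if b > a]


def split_into_subbands(spec, slices):
    """Split a (freq, time) spectrogram into a list of subband spectrograms."""
    return [spec[a:b] for a, b in slices]


def compensate_amplitude(complex_est, amp_residual):
    """Assumed additive compensation: adjust amplitude, keep the estimated phase."""
    amp = np.maximum(np.abs(complex_est) + amp_residual, 0.0)
    return amp * np.exp(1j * np.angle(complex_est))
```

Under these assumptions, `critical_band_slices()` yields contiguous, non-overlapping bin ranges covering all 257 bins, narrow at low frequencies and wide at high frequencies. Each subband would be processed by the complex spectrum branch, and `compensate_amplitude` stands in for the final step in which the amplitude branch refines the amplitude while the phase from the complex spectrum branch is kept.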
Authors: YE Zhongfu; ZHAO Ziwei; YU Runxiang (Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230022, China; National Engineering Research Center of Speech and Language Information Processing, Hefei 230022, China)
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2023, No. 2, pp. 262-273 (12 pages)
Funding: National Natural Science Foundation of China (61671418).
Keywords: critical frequency band; interactive; subband; dual-branch; monaural speech enhancement