Abstract
Speech bandwidth extension (BWE) aims to improve speech quality by reconstructing the missing high-frequency (HF) components from the correlation between the low-frequency (LF) and HF parts of speech. Methods based on the Gaussian mixture model (GMM) are widely used for BWE. However, the mapping function derived from a GMM is a piecewise linear transformation, and it ignores the correlation between neighboring speech frames. A BWE method based on the conditional restricted Boltzmann machine (CRBM) is therefore proposed. The method uses the CRBM to capture inter-frame information and, at the same time, maps the LF and HF spectral envelope features into a high-order statistical feature space, so that the non-linear relationship between the LF and HF of speech is modeled more deeply. Both objective and subjective comparison tests show that the proposed method outperforms the conventional GMM-based method.
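To make the remark about the GMM mapping concrete, the following is a minimal sketch of the standard GMM conditional-mean (MMSE) estimator, not the paper's actual baseline configuration. It assumes scalar LF and HF features for readability; the function names and the dictionary keys (`weight`, `mean_x`, `mean_y`, `var_x`, `cov_xy`) are hypothetical. Each mixture component contributes one linear regression of the HF feature on the LF feature, and the posterior weights blend these regressions, which is why the overall mapping is described as piecewise linear. Only the current frame enters the estimate, so no inter-frame information is used.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_map(x, components):
    """Estimate the HF feature y from the LF feature x as E[y | x].

    `components` describes a joint GMM over (x, y). Each entry is a dict
    with the hypothetical keys weight, mean_x, mean_y, var_x, cov_xy.
    """
    # Unnormalized posterior of each component given the LF feature x.
    likes = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components]
    total = sum(likes)

    y_hat = 0.0
    for like, c in zip(likes, components):
        # Within one component the estimate is a linear function of x.
        slope = c["cov_xy"] / c["var_x"]
        y_hat += (like / total) * (c["mean_y"] + slope * (x - c["mean_x"]))
    return y_hat


# Illustrative two-component model (made-up parameters, not from the paper).
components = [
    {"weight": 0.5, "mean_x": -2.0, "mean_y": 1.0, "var_x": 1.0, "cov_xy": 0.5},
    {"weight": 0.5, "mean_x": 2.0, "mean_y": -1.0, "var_x": 1.0, "cov_xy": -0.5},
]
```

With a single component the estimator reduces to one straight line; with several components the slope changes as the posterior weights shift from one component to another.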
Authors
WANG Yingxue (王迎雪)
ZHAO Shenghui (赵胜辉)
KUANG Jingming (匡镜明)
(School of Information and Electronics, Beijing Institute of Technology, Beijing 100081; School of Computer Science, Carnegie Mellon University, Pittsburgh 15213, US)
Source
Acta Acustica (《声学学报》)
EI
CSCD
PKU Core Journal (北大核心)
2017, No. 3, pp. 370-376 (7 pages)
Funding
Supported by a research project funded by Ericsson, Sweden