期刊文献+

考虑帧间信息的语音带宽扩展

Speech bandwidth extension supported by temporal information
下载PDF
导出
摘要 语音带宽扩展是为了提高语音质量,利用语音低频和高频之间的相关性重构语音高频的一种技术。高斯混合模型法是语音带宽技术中被广泛应用的一种方法,但是,该方法的映射函数是分段线性函数,且没有考虑语音前后帧的相关信息。因此,提出了一种基于条件受限玻尔兹曼机的方法。该方法利用条件受限玻尔兹曼机提取了语音信号的帧间信息,同时将语音低频、高频特征参数映射为高阶统计特性,深层发掘和模拟了语音低频和高频之间的非线性关系。客观和主观对比测试结果都表明,该方法性能优于传统的高斯混合模型方法。 Speech Bandwidth Extension (BWE) aims to improve the quality of speech by reconstructing the missing High Frequency (HF) components using the correlation that exists between the Low Frequency (LF) and HF of speech. The Gaussian Mixture Model (GMM) based methods are widely used. However, the derived mapping function by GMM is a piece-wise linear transformation and ignores the temporal information of speech. Thus, a novel BWE method is proposed for estimation of the HF parts of speech by exploiting Conditional Restricted Boltzmann Machines (CRBM). The proposed method introduces CRBM to obtain time information and model deep non-linear relationships between the spectral envelope features of LF and HF by building high-order eigen spaces between the LF and HF of the speech signal. The objective and subjective test results show that the proposed method outperforms the conventional GMM based method.
作者 王迎雪 赵胜辉 匡镜明 WANG Yingxue ZHAO Shenghui KUANG Jingming(School of Information and Electronic, Beijing Institute of Technology Beijing 100081 School of Computer Science, Carnegie Mellon University Pittsburgh 15213 US)
出处 《声学学报》 EI CSCD 北大核心 2017年第3期370-376,共7页 Acta Acustica
基金 瑞典爱立信课题资助
关键词 帧间 玻尔兹曼机 高斯混合模型 特征参数 分段线性函数 统计特性 非线性关系 方法性能 距离值 条件受限 Bandwidth Gaussian distribution Linear transformations Mathematical transformations Syntactics
  • 相关文献

参考文献1

二级参考文献14

  • 1俞一彪,王朔中.基于互信息匹配模型的说话人识别[J].声学学报,2004,29(5):462-466. 被引量:8
  • 2郎玥,赵胜辉,匡镜明.基于矢量量化的语音信号频带扩展[J].北京理工大学学报,2005,25(3):260-264. 被引量:4
  • 3党辰,戴葵,王苏峰,刘芸,王志英.高频重建技术SBR的研究与实现[J].电子学报,2004,32(F12):189-191. 被引量:2
  • 4俞一彪,王朔中.文本无关说话人识别的全特征矢量集模型及互信息评估方法[J].声学学报,2005,30(6):536-541. 被引量:7
  • 5Jax P, Vary P. Bandwidth extension of speech signals: a catalyst for the introduction of wideband speech coding. IEEE Communications Magazines, 2006; 44(5): 106--111.
  • 6Geiser B, Jax P. Bandwidth extension for hierarchical speech and audio coding in ITU-T rec. G.729.1. IEEE Transactions on Audio, Speech and Language Processing, 2007; 15(8): 2496--2509.
  • 7Dar Ghulam Raza, Cheung-Fat Chan. Enhancing quality of celp coded speech via wideband extension by using voic- ing GMM interpolation and HNM re-synthesis. Proceeding of IEEE International Conference on Acoustics, Speech~ Signal Processing. 2002; 4:1241--1244.
  • 8Nakatoh Y, Tuushima M, Norimatsu T. Generation of broadband speech from narrowband speech using piecewise linear mapping. In Proceeding of EUROSPEECH, 1997; 9: 1643--1646.
  • 9Enbom N, Klenijn W B. Bandwidth expansion of speech based on vector quantization of the reel frequency cepstral coefficients. IEEE Workshop on Speech Coding Proceedings, 1999; 2:171--173.
  • 10Park K Y, Kim H S. Narrowband to wideband conversion of speech using GMM based transformation. Proceeding of IEEE International Conference on Acoustics, Speech, Signal Processing, 2000; 4:1843--1846.

共引文献6

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部