Speech bandwidth extension supported by temporal information
(Original Chinese title: 考虑帧间信息的语音带宽扩展)

Abstract: Speech bandwidth extension (BWE) improves speech quality by reconstructing the missing high-frequency (HF) components from the correlation between the low-frequency (LF) and HF parts of speech. Gaussian mixture model (GMM) based methods are widely used for BWE, but the mapping function a GMM yields is piecewise linear and ignores the correlation between neighbouring speech frames. This paper therefore proposes a BWE method based on the conditional restricted Boltzmann machine (CRBM). The CRBM extracts the inter-frame (temporal) information of the speech signal while mapping the LF and HF spectral envelope feature parameters to high-order statistical features, which lets the model capture the deep nonlinear relationship between LF and HF. Objective and subjective comparison tests both show that the proposed method outperforms the conventional GMM-based method.
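The limitation the abstract attributes to the GMM baseline can be made concrete with a minimal sketch of the textbook joint-GMM conditional-mean mapping. This is an illustration under assumptions, not the paper's implementation: the function name `gmm_conditional_mean`, its parameters, and the feature layout (LF features first, HF features last in each joint vector) are hypothetical. Each mixture component contributes a local linear regression from LF to HF features, the regressions are blended by the component posteriors, and the estimate for a frame uses only that frame's LF features.

```python
import numpy as np

def gmm_conditional_mean(x, weights, means, covs, d_lf):
    """Estimate HF features for ONE frame from its LF features x.

    Assumes a joint GMM over z = [x; y] (LF features first, HF last).
    No neighbouring frames enter the computation.
    """
    n_comp = len(weights)
    log_post = np.empty(n_comp)
    local_estimates = []
    for k in range(n_comp):
        mu_x, mu_y = means[k][:d_lf], means[k][d_lf:]
        cov_xx = covs[k][:d_lf, :d_lf]
        cov_yx = covs[k][d_lf:, :d_lf]
        diff = x - mu_x
        solved = np.linalg.solve(cov_xx, diff)  # cov_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(cov_xx)
        # Log of weight_k * N(x; mu_x, cov_xx), up to a shared constant.
        log_post[k] = np.log(weights[k]) - 0.5 * (diff @ solved + logdet)
        # Local linear regression of component k.
        local_estimates.append(mu_y + cov_yx @ solved)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()  # posterior probability of each component given x
    return post @ np.array(local_estimates)
```

With a single component the mapping reduces to one linear regression; with several, the result is a posterior-weighted combination of linear pieces. A CRBM-style model, by contrast, would also condition on features from preceding frames.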

Authors: WANG Yingxue, ZHAO Shenghui, KUANG Jingming (School of Information and Electronics, Beijing Institute of Technology, Beijing 100081; School of Computer Science, Carnegie Mellon University, Pittsburgh 15213, US)

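Affiliations: School of Information and Electronics, Beijing Institute of Technology; School of Computer Science and Engineering, Carnegie Mellon University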

Source: Acta Acustica (声学学报), 2017, No. 3, pp. 370-376 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: Supported by an Ericsson (Sweden) research project.

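Keywords: inter-frame; Boltzmann machine; Gaussian mixture model; feature parameters; piecewise linear function; statistical properties; nonlinear relationship; method performance; distance value; conditional restricted; bandwidth; Gaussian distribution; linear transformations; mathematical transformations; syntactics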

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

