
2 kb/s Waveform Interpolation Speech Coding Based on Non-negative Matrix Factorization
Abstract: In waveform interpolation (WI) speech coders, decomposing and quantizing the characteristic waveform (CW) with low delay, high precision, and low complexity has long been both a research focus and a difficulty. This paper proposes decomposing the speech CW by non-negative matrix factorization (NMF). The decomposition needs only the speech signal of the current frame, so it adds no extra delay to the coder. To improve decomposition precision, the CWs are first classified by the maximum pitch period of their subframes and then decomposed separately for each class. In addition, a band-partitioned initialization of the NMF basis vectors, motivated by a cochlear model, raises the CW decomposition precision to a level comparable with that of second-order singular value decomposition (SVD). To reduce the computational complexity of the WI coder, the CW alignment module of the conventional WI coder is removed, and the NMF factorization rank is set to 16 as a trade-off between computational complexity and decomposition precision. Finally, the coding matrix produced by NMF is quantized with a split matrix quantization scheme. Subjective A/B listening tests show that the speech synthesized by the proposed 2 kb/s NMF-WI coder is close in quality to that of a 2.4 kb/s SVD-based WI coder. Mean opinion score (MOS) tests indicate that its quality is slightly worse than that of the 2.4 kb/s MELP coder.
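
The decomposition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the CWs of one frame are stacked as a non-negative matrix `v` (rows are frequency bins, columns are subframes, for example magnitude spectra), it uses the standard Lee-Seung multiplicative updates for the Frobenius cost, and it replaces the cochlear-model bands, which the abstract does not specify, with equal-width bands. All function names (`band_init`, `nmf`, `frob_error`) are hypothetical. The paper sets the rank to 16; the demo uses a smaller rank only to keep the toy matrix small.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def band_init(n_bins, rank, rng):
    """Initial basis W (n_bins x rank): column k is nonzero only inside band k.
    Equal-width bands are a placeholder assumption, not the paper's band layout."""
    edges = [k * n_bins // rank for k in range(rank + 1)]
    w = [[0.0] * rank for _ in range(n_bins)]
    for k in range(rank):
        for i in range(edges[k], edges[k + 1]):
            w[i][k] = 0.5 + rng.random()
    return w

def frob_error(v, w, h):
    """Frobenius norm of the residual V - WH."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5

def nmf(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor V ~ W H with multiplicative updates; H plays the role of the coding matrix."""
    rng = random.Random(seed)
    n_bins, n_frames = len(v), len(v[0])
    w = band_init(n_bins, rank, rng)
    h = [[0.5 + rng.random() for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_frames)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T); entries initialized to zero stay zero
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_bins)]
    return w, h

# Toy usage: 24 frequency bins, 8 subframes, rank 4 (random data, not speech).
demo_rng = random.Random(1)
v_demo = [[demo_rng.random() for _ in range(8)] for _ in range(24)]
w_demo, h_demo = nmf(v_demo, rank=4, n_iter=100)
print("residual norm:", frob_error(v_demo, w_demo, h_demo))
```

Because the updates are multiplicative, basis entries that start at zero remain zero, so the hard band partition in this sketch persists through all iterations. Whether the paper enforces the partition this strictly or only uses it as a starting point is not stated in the abstract.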
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2008, No. 4, pp. 632-638 (7 pages)
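Funding: National Natural Science Foundation of China (No. 60372063); Beijing Natural Science Foundation (No. 4042009); Science and Technology Development Program of the Beijing Municipal Education Commission (No. KM200710005001)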
Keywords: speech coding; waveform interpolation; characteristic waveform; non-negative matrix factorization
