Decoder-Side HMM Estimation Algorithm for Energy Parameters

HMM estimation of energy contours in speech decoders


Abstract: In low bit rate speech coding, how effectively the feature parameters are quantized is a key factor in the quality of the speech synthesized by the vocoder. This paper proposes a decoder-side recovery algorithm for the energy parameter that estimates the energy contour from the line spectral frequency (LSF) and unvoiced/voiced (U/V) decision parameters. The algorithm exploits the correlations among the feature parameters, using a hidden Markov model (HMM) to describe the joint statistics of the LSF, U/V, and energy parameters. Because the energy is recovered at the decoder, the bits otherwise needed to quantize it are saved, which improves the overall quantization performance of the feature parameters. Tests show that the algorithm raises the mean opinion score (MOS) of speech synthesized by a 150 b/s mixed excitation linear prediction (MELP) vocoder by 0.042, indicating that it is feasible for parameter quantization in ultra low bit rate vocoders.
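The abstract does not give the model details, so the following is only a minimal sketch of how an HMM could recover an energy contour at the decoder. It assumes, hypothetically, that each hidden state stands for one quantized energy level, that the per-frame observation (LSF features plus the U/V flag) follows a diagonal Gaussian in each state, and that the contour is read off the Viterbi path. The function name `viterbi_energy` and every parameter value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def viterbi_energy(obs, log_pi, log_A, means, variances, levels):
    """Most likely energy-level trajectory given per-frame feature vectors.

    obs:       (T, D) observed features per frame (e.g. LSFs plus a U/V flag)
    log_pi:    (K,)   log initial probabilities of the K energy states
    log_A:     (K, K) log transition probabilities between energy states
    means:     (K, D) per-state mean of the diagonal-Gaussian emission
    variances: (K, D) per-state variance of the diagonal-Gaussian emission
    levels:    (K,)   energy value represented by each state
    """
    T, K = obs.shape[0], log_pi.shape[0]
    # Log-likelihood of every frame under every state's diagonal Gaussian.
    diff = obs[:, None, :] - means[None, :, :]
    log_b = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)

    delta = log_pi + log_b[0]              # best log-score of paths ending in each state
    back = np.zeros((T, K), dtype=int)     # best predecessor of each state at each frame
    for t in range(1, T):
        scores = delta[:, None] + log_A    # scores[i, j]: best path into i, then i -> j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_b[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return levels[path]

# Toy model (all values hypothetical): two energy states, two-dimensional
# features of the form [normalized LSF statistic, U/V flag].
levels = np.array([30.0, 60.0])            # assumed low / high energy, in dB
means = np.array([[0.0, 0.0], [1.0, 1.0]])
variances = np.full((2, 2), 0.1)
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
obs = np.array([[0.1, 0.0], [0.0, 0.0], [0.9, 1.0], [1.1, 1.0], [0.2, 0.0]])

energy = viterbi_energy(obs, log_pi, log_A, means, variances, levels)
print(energy)
```

In this toy setup the decoder receives only `obs` and the trained model, so no bits are spent on the energy itself. A real system would have to train the transition and emission parameters on speech data and would likely use more states and a richer emission model.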

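Authors: JI Zhe, GAO Shengxiang, TANG Kun, JIN Xin

Affiliations: Department of Electronic Engineering, Tsinghua University; National Computer Network and Information Security Management Center

Source: Journal of Tsinghua University (Science and Technology), 2013, No. 6, pp. 869-872 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National High-Tech Research and Development Program of China ("863" Program), No. 2011AA010601

Keywords: speech signal processing; energy parameter; hidden Markov model (HMM); line spectral frequency (LSF); parametric coding

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]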



