Output-based speech quality evaluation based on Non-uniform Linear Prediction Cepstrum and Gaussian Mixture Models

Original title: 一种基于非均匀谱系数和GMM的语音质量评估方法. Cited by: 2
Abstract: This paper proposes a new output-based objective method for speech quality evaluation built on Non-uniform Linear Prediction Cepstrum (NLPC) coefficients and Gaussian Mixture Models (GMMs). First, the Bark Bilinear Transform (BBT) warps the linear-frequency spectrum so that the warped spectrum reflects the non-uniform frequency resolution of human hearing. NLPC coefficients are then computed by linear prediction on the warped spectrum. NLPC features extracted from clean reference speech are used to train a GMM, which serves as the reference model. A consistency measure between this reference model and the NLPC vectors of the degraded speech serves as the indicator of speech quality. Finally, Multivariate Adaptive Regression Splines (MARS) are used to map the consistency measure to the subjective Mean Opinion Score (MOS), which yields an objective model for predicting MOS. Experimental results indicate that the proposed algorithm performs better than the algorithm in the ITU-T P.563 standard.
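The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the allpass coefficient uses the closed-form fit published by Smith and Abel (reference 6), which the paper may or may not adopt as-is; the consistency measure is taken to be the average per-frame log-likelihood under the reference GMM; the GMM parameters are toy values rather than the result of EM training; and all function names are hypothetical. Linear prediction, cepstrum computation, and the MARS mapping are omitted.

```python
import math


def bark_allpass_coefficient(fs_khz):
    """Allpass coefficient for Bark warping at sampling rate fs_khz (kHz).

    Assumption: closed-form fit from Smith & Abel (1999).
    """
    return 1.0674 * math.sqrt((2.0 / math.pi) * math.atan(0.06583 * fs_khz)) - 0.1916


def warp_frequency(omega, rho):
    """Map a normalized angular frequency in [0, pi] through a first-order allpass.

    For rho > 0 the low-frequency region is stretched, which mimics the
    finer resolution of hearing at low frequencies.
    """
    return omega + 2.0 * math.atan2(rho * math.sin(omega), 1.0 - rho * math.cos(omega))


def diag_gmm_log_likelihood(x, weights, means, variances):
    """Log-density of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    peak = max(log_terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def consistency_measure(frames, weights, means, variances):
    """Average per-frame log-likelihood (an assumed form of the consistency measure)."""
    total = sum(diag_gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)


# Toy reference model: two components over 2-D feature vectors (made-up values).
ref_weights = [0.6, 0.4]
ref_means = [[0.0, 0.0], [2.0, 1.0]]
ref_variances = [[1.0, 1.0], [0.5, 0.5]]

# Made-up feature frames standing in for NLPC vectors.
clean_like = [[0.1, -0.2], [1.9, 1.1], [0.3, 0.2]]
degraded_like = [[4.0, -3.0], [5.5, 4.0], [-4.0, 3.5]]

rho = bark_allpass_coefficient(8.0)  # narrow-band telephony, 8 kHz
print("allpass coefficient:", rho)
print("warped pi/4:", warp_frequency(math.pi / 4, rho))
print("clean-like score:", consistency_measure(clean_like, ref_weights, ref_means, ref_variances))
print("degraded-like score:", consistency_measure(degraded_like, ref_weights, ref_means, ref_variances))
```

In this sketch, frames that lie close to the reference model score higher than frames far from it, which is the behavior a regression stage would then map to a predicted MOS.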
Source: Journal of Circuits and Systems (《电路与系统学报》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 4, pp. 104-109, 90 (7 pages in total)
Keywords: speech quality; objective speech quality evaluation; non-uniform linear prediction cepstrum coefficient; Gaussian mixture model; multivariate adaptive regression splines

References (10)

  • 1. C Jin, R Kubichek. Vector quantization techniques for output-based objective speech quality [A]. In Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing [C]. 1996. 491-494.
  • 2. Yan Tianyun (鄢田云), Yun Xia (云霞), Jin Fan (靳蕃), Zhu Qingjun (朱庆军). RBF neural networks and their application to output-based objective speech quality evaluation [J]. Acta Electronica Sinica (电子学报), 2004, 32(8): 1282-1285. Cited by: 7
  • 3. D S Kim. ANIQUE: An auditory model for single-ended speech quality estimation [J]. IEEE Trans. Speech Audio Process., 2005, 13(5): 821-831.
  • 4. ITU-T Rec. G.729-Annex B, A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to Recommendation V.70 [S]. Int. Telecommun. Union, Geneva, Switzerland, 1996-11.
  • 5. ITU-T P.563, Single Ended Method for Objective Speech Quality Assessment in Narrow-Band Telephony Applications [S]. Int. Telecommun. Union, Geneva, Switzerland, 2004-05.
  • 6. J O Smith, J S Abel. Bark and ERB Bilinear Transform [J]. IEEE Transactions on Speech and Audio Processing, 1999, 7(6): 697-708.
  • 7. A Dempster, N Laird, D Rubin. Maximum likelihood from incomplete data via the EM algorithm [J]. J. R. Stat. Soc., 1977, 39: 1-38.
  • 8. J H Friedman. Multivariate adaptive regression splines [J]. Ann. Statist., 1991, 19(1): 1-141.
  • 9. ITU-T Rec. P. Supplement 23, ITU-T Coded-Speech Database [S]. Int. Telecommun. Union, Geneva, Switzerland, 1998-02.
  • 10. Huang Huiming (黄惠明), Wang Ying (王瑛), Zhao Siwei (赵思伟), Zhang Zhiyi (张知易). Research on objective speech quality evaluation for speech systems [J]. Acta Electronica Sinica (电子学报), 2000, 28(4): 112-114. Cited by: 27
