
Quasi-clean speech-based speech quality evaluation method under complex environments (复杂环境下基于准干净语音的音质评价方法)
Abstract: A new non-intrusive (no-reference) objective speech quality evaluation method for noisy speech in complex environments is proposed. The method combines quasi-clean speech construction with an intrusive (full-reference) evaluation model, achieving performance close to that of intrusive objective evaluation. First, an improved minima controlled recursive averaging (MCRA) algorithm and multi-band spectral subtraction are used to obtain quasi-clean speech from the noisy speech. The quasi-clean speech then serves as the reference signal for a modified perceptual evaluation of speech quality (PESQ) algorithm, which measures the distortion between the reference and the noisy speech and yields an objective score, expressed as a mean opinion score (MOS), for the noisy speech. Experimental results show that the proposed method achieves an objective-score correlation of 0.927. This is about 99% of the 0.931 obtained with the intrusive objective evaluation standard, and 7.4% higher than the non-intrusive objective evaluation standard.
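The quasi-clean speech construction step can be illustrated with a minimal sketch of multi-band spectral subtraction on a single frame. This is not the authors' implementation: the noise power estimate is taken as given (the paper obtains it with an improved MCRA algorithm, which is not reproduced here), the over-subtraction rule and the spectral floor value are commonly used choices assumed for illustration, and the function names `oversubtraction_factor` and `multiband_spectral_subtract` are hypothetical.

```python
import math


def oversubtraction_factor(snr_db):
    """Map a band's SNR in dB to an over-subtraction factor (assumed rule)."""
    if snr_db < -5.0:
        return 4.75
    if snr_db <= 20.0:
        return 4.0 - 0.15 * snr_db
    return 1.0


def multiband_spectral_subtract(noisy_power, noise_power, band_edges, floor=0.002):
    """Subtract a per-band scaled noise estimate from one frame's power spectrum.

    noisy_power, noise_power: per-bin power values for one frame.
    band_edges: bin indices delimiting the bands, e.g. [0, 4, 8].
    floor: spectral floor as a fraction of the noisy power (assumed value).
    """
    clean = list(noisy_power)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        # Band SNR from the summed noisy and noise power in this band.
        band_noisy = max(sum(noisy_power[lo:hi]), 1e-12)
        band_noise = max(sum(noise_power[lo:hi]), 1e-12)
        snr_db = 10.0 * math.log10(band_noisy / band_noise)
        alpha = oversubtraction_factor(snr_db)
        for k in range(lo, hi):
            diff = noisy_power[k] - alpha * noise_power[k]
            # Clamp to a small fraction of the noisy power to avoid negative power.
            clean[k] = max(diff, floor * noisy_power[k])
    return clean


# One frame with a strong low band and a weak high band (made-up values).
noisy = [10.0] * 4 + [1.0] * 4
noise = [1.0] * 8
quasi_clean = multiband_spectral_subtract(noisy, noise, [0, 4, 8])
```

In the method described above, the time-domain signal rebuilt from such frames would play the role of the reference input to the modified PESQ model, with the original noisy speech as the degraded input.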
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2016, No. 7, pp. 121-126 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (61571192); Guangdong Provincial Public Welfare Fund Project (2015A010103003).
Keywords: speech quality; objective evaluation; non-intrusive; complex environments; quasi-clean speech construction
