A Sinusoidal Modeling Method Based on Matching-Pursuits with Perceptual Gradient (cited by: 6)
Abstract: Matching pursuits, an adaptive signal decomposition algorithm, provides a new framework for sinusoidal modeling of speech and audio signals. This paper analyzes sinusoidal modeling based on matching pursuits, as well as the sinusoidal modeling algorithm based on perceptually weighted matching pursuits, and on that basis proposes a sinusoidal modeling method with perceptual gradient. Exploiting the adaptive, dynamic nature of matching pursuits, the method uses a psychoacoustic model to compute a dynamic masking threshold from the currently synthesized signal. With that threshold as the reference, it extracts the most perceptually significant component from the residual signal, so that the perceptual information contained in the synthesized signal increases as quickly as possible. Even when the model precision is low, the method yields synthesized speech of relatively high quality. Experiments show that the method makes better use of the characteristics of human hearing and that the resulting models are more reasonable and effective. Both an objective SNR comparison and subjective listening tests support the soundness and superiority of the proposed algorithm.
Source: Journal of Software (《软件学报》), 2003, Issue 3, pp. 467-472 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Keywords: matching pursuit; sinusoidal modeling; perceptual gradient; psychoacoustic model; speech signal processing
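The analysis loop outlined in the abstract (recompute a masking threshold from the current synthesis, then pull the most audible component out of the residual) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the paper's implementation: `masking_threshold` is a hypothetical, crude stand-in for a real psychoacoustic model (a triangular spreading kernel with a fixed dB offset), the function and parameter names are invented for this example, the frame is unwindowed, and the selection score (residual power divided by threshold) is one plausible reading of "most perceptually significant".

```python
import numpy as np

def masking_threshold(synth_power, spread_bins=8, offset_db=-12.0, floor=1e-8):
    """Hypothetical stand-in for a psychoacoustic model (assumption)."""
    # Triangular kernel smears the synthesized power over neighbouring bins.
    taps = np.arange(-spread_bins, spread_bins + 1)
    kernel = 1.0 - np.abs(taps) / (spread_bins + 1)
    spread = np.convolve(synth_power, kernel, mode="same")
    # Lower the smeared power by a fixed offset; never go below the floor.
    return np.maximum(spread * 10.0 ** (offset_db / 10.0), floor)

def perceptual_mp(frame, n_atoms=10, nfft=4096):
    """Greedy sinusoidal decomposition of one frame with a dynamic threshold."""
    n = np.arange(len(frame))
    residual = np.asarray(frame, dtype=float).copy()
    synth = np.zeros_like(residual)
    picked = []  # (normalized frequency in cycles/sample, amplitude)
    for _ in range(n_atoms):
        # Zero-padded spectra give an overcomplete grid of candidate frequencies.
        res_power = np.abs(np.fft.rfft(residual, nfft)) ** 2
        synth_power = np.abs(np.fft.rfft(synth, nfft)) ** 2
        # The threshold is recomputed from the *current* synthesized signal.
        score = res_power / masking_threshold(synth_power)
        score[0] = score[-1] = 0.0  # skip DC and Nyquist for simplicity
        k = int(np.argmax(score))
        omega = 2.0 * np.pi * k / nfft
        # Least-squares projection of the residual onto cos/sin at that frequency.
        basis = np.column_stack([np.cos(omega * n), np.sin(omega * n)])
        coef, *_ = np.linalg.lstsq(basis, residual, rcond=None)
        component = basis @ coef
        synth += component
        residual -= component
        picked.append((k / nfft, float(np.hypot(coef[0], coef[1]))))
    return synth, residual, picked
```

On the first iteration the synthesis is empty, so the threshold is flat and the loop reduces to ordinary matching pursuit. In later iterations, residual energy that sits close to already-synthesized components is discounted, which is the intuition behind recomputing the threshold at every step rather than fixing it in advance.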