A Sinusoidal Modeling Method Based on Matching-Pursuits with Perceptual Gradient

Cited by: 6

Abstract: As an adaptive signal decomposition algorithm, matching pursuit provides a new framework for sinusoidal modeling of speech and audio signals. This paper analyzes the sinusoidal modeling procedure based on matching pursuit, together with the perceptually weighted matching pursuit sinusoidal modeling algorithm, and on that basis proposes a perceptual gradient sinusoidal modeling method. Exploiting the adaptive, dynamic nature of matching pursuit, the method uses a psychoacoustic model to compute a dynamic masking threshold from the currently synthesized signal and, with that threshold as a reference, extracts the most perceptually significant component from the residual signal, so that the perceptual information in the synthesized signal grows as quickly as possible. Even at low model precision, the method yields synthesized speech of fairly high quality. Experiments show that the method makes better use of the characteristics of human hearing and that its modeling results are more reasonable and effective. Both objective SNR comparisons and subjective listening tests demonstrate the soundness and superiority of the proposed algorithm.
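The greedy loop that the abstract describes (pick one sinusoid from the residual, add it to the synthesized signal, repeat) can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: it assumes a rectangular window and sinusoids restricted to the DFT frequency grid, and the name `mp_sinusoids` and the `weight_fn` hook are hypothetical. The hook marks where a masking threshold computed from the current synthesized signal could bias the selection; the record does not give the psychoacoustic model or the exact perceptual gradient criterion, so neither is reproduced here.

```python
import numpy as np


def mp_sinusoids(x, n_atoms, weight_fn=None):
    """Greedy matching pursuit over real sinusoids on the DFT grid.

    weight_fn (hypothetical hook) maps the current synthesized signal to one
    non-negative weight per rfft bin; None gives plain energy-based pursuit.
    Returns (atoms, synth, residual), where atoms is a list of
    (bin_index, amplitude, phase) tuples.
    """
    n = len(x)
    t = np.arange(n)
    residual = np.asarray(x, dtype=float).copy()
    synth = np.zeros(n)
    atoms = []
    for _ in range(n_atoms):
        # rfft bins are the inner products of the residual with the atoms.
        spec = np.fft.rfft(residual)
        score = np.abs(spec) ** 2
        if weight_fn is not None:
            # Assumption: larger weight means the bin is judged more audible.
            score = score * weight_fn(synth)
        score[0] = 0.0            # ignore the DC bin
        if n % 2 == 0:
            score[-1] = 0.0       # ignore the Nyquist bin
        k = int(np.argmax(score))
        # Projection of the residual onto the cos/sin pair at bin k.
        amp = 2.0 * np.abs(spec[k]) / n
        phase = np.angle(spec[k])
        component = amp * np.cos(2.0 * np.pi * k * t / n + phase)
        synth += component
        residual -= component
        atoms.append((k, amp, phase))
    return atoms, synth, residual


# Usage: two on-grid sinusoids are recovered in order of decreasing energy.
n = 64
t = np.arange(n)
x = 1.0 * np.cos(2 * np.pi * 5 * t / n + 0.3) + 0.5 * np.cos(2 * np.pi * 12 * t / n - 1.0)
atoms, synth, residual = mp_sinusoids(x, n_atoms=2)
```

With `weight_fn=None` the loop selects by residual energy alone. Passing a function that recomputes per-bin weights from `synth` at every iteration is one way to make the selection depend on what has already been synthesized, which is the role the abstract assigns to the dynamic masking threshold.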

Authors: Zhang Wenyao (张文耀), Xu Gang (许刚), Wang Yuguo (王裕国)

Affiliation: Institute of Software, Chinese Academy of Sciences

Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2003, No. 3, pp. 467-472 (6 pages)

Keywords: matching pursuit; sinusoidal modeling; perceptual gradient; psychoacoustic model; speech signal processing

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

Citation network

References (12)

[1] McAulay RJ, Quatieri TF. Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech and Signal Processing, 1986, 34(4): 744-754.
[2] McAulay RJ, Quatieri TF. Sinusoidal coding. In: Kleijn WB, ed. Speech Coding and Synthesis. Netherlands: Elsevier Science B.V., 1995. 123-173.
[3] George EB, Smith MJT. Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model. IEEE Transactions on Speech and Audio Processing, 1997, 5(5): 389-406.
[4] Morgan DP, George EB, Lee LT, Kay SM. Co-channel speaker separation by harmonic enhancement and suppression. IEEE Transactions on Speech and Audio Processing, 1997, 5(5): 407-425.
[5] George EB, Smith MJT. Analysis-by-synthesis/overlap-add sinusoidal modeling applied to the analysis and synthesis of musical tones. Journal of the Audio Engineering Society, 1992, 40(6): 497-515.
[6] Verma TS, Meng HY. Sinusoidal modeling using frame-based perceptually weighted matching pursuits. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Piscataway, NJ: IEEE, 1999. 981-984.
[7] ISO/IEC JTC1/SC29/WG11 MPEG, IS11172-3. Information Technology: Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s, Part 3: Audio. 1992.
[8] Painter T, Spanias A. Perceptual coding of digital audio. Proceedings of the IEEE, 2000, 88(4): 451-513.
[9] Mallat S, Zhang Z. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 1993, 41(12): 3397-3415.
[10] Jaggi S, Karl WC, Mallat S, Willsky AS. High resolution pursuit for feature extraction. Applied and Computational Harmonic Analysis, 1998, 5(3): 428-449.

Co-cited literature (48)

1. Ma Shiwei, Wu Congmao, Yuan Kang. Time-frequency modeling based on time-sheared Gabor atoms [J]. Journal of System Simulation, 2006, 18(z2): 155-158. Cited by: 2
2. Zhang Jialu. The place theory of hearing and the frequency difference limen [J]. Acta Acustica, 2006, 31(2): 97-100. Cited by: 9
3. Zhao Yujuan, Shui Penglang, Zhang Lingshuang. Sparse signal approximation based on subspace matching pursuit [J]. Signal Processing (信号处理), 2006, 22(4): 501-505. Cited by: 9
4. Shirazi J, Ghaemmaghami S, Razzazi F. Improvements in audio classification based on sinusoidal modeling. IEEE International Conference on Multimedia and Expo, Hannover, Germany, 2008: 1485-1488.
5. Ramamohan S, Dandapat S. Sinusoidal model-based analysis and classification of stressed speech. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(3): 737-746.
6. Kim Kihong, Chung Yongick, Park Cheolyong. Speech quality enhancement based on sinusoidal model using Chebyshev filter. Future Generation Communication and Networking, 2007, 1: 323-327.
7. Master AS. Sinusoidal modeling parameter estimation via a dynamic channel vocoder model. IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, USA, 2002, 2: 1857-1860.
8. George EB, Smith MJT. Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model. IEEE Transactions on Speech and Audio Processing, 1997, 5(5): 389-406.
9. McAulay R, Quatieri T. Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1986, 34(4): 744-754.
10. Mallat SG, Zhang Zhifeng. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 1993, 41(12): 3397-3415.

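Citing literature (6)

1. Wu Mingqin, Yu Fengqin, Han Kun. A speech enhancement method based on chirp atom decomposition [J]. Microelectronics & Computer, 2005, 22(12): 74-77.
2. Wang Jing, Zhao Shenghui, Kuang Jingming. An implementation scheme for a harmonic and individual-line sinusoidal model based on matching pursuit [J]. Journal of Electronics & Information Technology, 2006, 28(6): 1016-1020.
3. Yang Cui, Wei Gang. A fast sinusoidal analysis method based on narrowband spectral energy [J]. Acta Acustica, 2009, 34(5): 462-470. Cited by: 2
4. Yang Cui, Wei Gang. Fast sinusoidal analysis algorithm based on energy of narrowband spectrum [J]. Chinese Journal of Acoustics, 2010, 29(4): 413-427.
5. Li Xiaoming, Bao Changchun. A universal speech and audio coding method using empirical mode decomposition [J]. Signal Processing (信号处理), 2013, 29(10): 1274-1282.
6. Wang Binbin, Yu Fengqin. A modeling method for Chinese finals based on exponential sinusoidal atoms [J]. Computer Engineering and Applications, 2014, 50(4): 223-226.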

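Second-level citing literature (2)

1. Yang Cui. A noise-suppressing frequency estimation algorithm based on ESPRIT [J]. Computer Engineering, 2010, 36(14): 246-248. Cited by: 2
2. Liang Ruiyu, Zou Cairong, Zhao Li, Wang Qingyun, Xi Ji. An experimental study of high-frequency hearing-loss enhancement methods for Chinese-language digital hearing aids [J]. Acta Acustica, 2012, 37(5): 527-533. Cited by: 1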

