Audio Codec Using Frequency Masking Based on Frequency Domain Linear Prediction

Abstract: Frequency Domain Linear Prediction (FDLP) gives an approximation of the Hilbert envelope of a signal, which has been shown to carry most of the speech information. An FDLP-based codec works on long temporal segments and preserves the information carried by the time-domain envelope very well. Such a codec reconstructs the signal with good quality, but its coding efficiency is low. This paper introduces frequency masking into the FDLP-based codec to reduce the bit rate. Frequency masking is a hearing phenomenon in which the hearing threshold of a sound rises when a more intense sound is present simultaneously. A psychoacoustic model is used to estimate the hearing threshold and the absolute threshold of hearing (ATH) of the FDLP carrier signals, and the bit allocation for the FDLP carrier signal in each frequency sub-band is computed from these two thresholds. Applying frequency masking reduces the bit rate by about 5% to 6% (the Chinese abstract states 5%, the English abstract 6%). The method was evaluated with Perceptual Evaluation of Audio Quality (PEAQ) and the MUSHRA listening test.
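
The threshold-driven bit allocation described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the ATH curve below is Terhardt's widely used closed-form approximation, the per-band masking thresholds are assumed to come from some psychoacoustic model that is not shown, and the rule of roughly 6.02 dB of signal-to-noise ratio per bit is the usual uniform-quantizer rule of thumb, used here as a stand-in. The function names `ath_db_spl` and `allocate_bits` and the `max_bits` cap are hypothetical.

```python
import math


def ath_db_spl(freq_hz):
    """Absolute threshold of hearing in dB SPL (Terhardt's approximation)."""
    f = freq_hz / 1000.0  # the formula is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def allocate_bits(band_centers_hz, band_levels_db, masking_thresholds_db,
                  db_per_bit=6.02, max_bits=16):
    """Assign bits per sub-band from its signal-to-threshold ratio.

    Assumption (not from the paper): the effective threshold of a band is
    the larger of its masking threshold and the ATH at the band centre,
    and each bit buys about `db_per_bit` dB of quantization SNR.
    """
    bits = []
    for fc, level, mask in zip(band_centers_hz, band_levels_db,
                               masking_thresholds_db):
        threshold = max(mask, ath_db_spl(fc))
        margin = level - threshold  # dB by which the band exceeds the threshold
        if margin <= 0:
            bits.append(0)  # band is inaudible, so spend no bits on it
        else:
            bits.append(min(max_bits, math.ceil(margin / db_per_bit)))
    return bits
```

Calling `allocate_bits([100, 1000, 3300], [20, 40, 60], [0, 30, 10])` gives no bits to the 100 Hz band, whose level lies below the ATH, and more bits to bands that exceed their threshold by a larger margin. The bit-rate saving in such a scheme comes from bands whose quantization noise can be hidden under the threshold.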

Authors: Zhang Pei (章佩), Wang Song (王松), Dong Shi (董石), Jiang Lin (姜林)

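Affiliations: Shenzhen Research Institute of Wuhan University; Wuhan University; School of Software, East China University of Technology (东华理工大学)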

Source: Industrial Control Computer (《工业控制计算机》), 2014, No. 6, pp. 75-77 (3 pages)

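Funding: Shenzhen Special Fund for the Development of the Biology, Internet, New Energy and New Materials Industries, Basic Research Program, "Research on Key Technologies of Mobile Multimedia Systems Based on AVS-P10" (JC201104220203A)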

Keywords: psychoacoustic model; frequency masking; audio coding; Frequency Domain Linear Prediction (FDLP)

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
