Abstract
Digital speech is now used very widely, and the large volume of speech streams consumes substantial network bandwidth and server storage. It is therefore important to apply lossy compression that lowers the bit rate while leaving perceptual quality essentially unaffected. This paper proposes a novel online dictionary learning method for compressing speech formant envelopes. Unlike conventional linear methods, the proposed method shifts the dictionary atoms along the frequency axis so that they fit the formants better, and it uses the Hilbert transform to determine the optimal frequency shift quickly and accurately. Experimental results show that, with a lower bound of 99.5% on reconstruction similarity, the compressed representation requires on average 99% fewer bits than the original envelope data. The method is therefore suited to scenarios with strict limits on transmission bandwidth or storage capacity, while keeping the decompressed speech natural-sounding.
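The abstract does not spell out how the Hilbert transform is applied to the dictionary atoms, so the following is only a generic illustration of one standard use of it: building the analytic signal of a real sequence and multiplying by a complex exponential to shift its spectrum by a chosen number of bins. The function names (`analytic_signal`, `frequency_shift`, `peak_bin`) and the toy cosine "atom" are assumptions made for this sketch; it is not the authors' algorithm and does not perform dictionary learning or the search for an optimal shift.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short sequences)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def analytic_signal(x):
    """Return x + j*H(x), where H is the discrete Hilbert transform.

    Implemented in the frequency domain: keep DC (and Nyquist for even n),
    double the positive-frequency bins, zero the negative-frequency bins.
    """
    n = len(x)
    X = dft(x)
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        gain[k] = 2.0
    return idft([X[k] * gain[k] for k in range(n)])


def frequency_shift(x, shift_bins):
    """Shift the spectrum of real x upward by shift_bins DFT bins.

    shift_bins may be fractional; this is the classic single-sideband shift.
    """
    n = len(x)
    z = analytic_signal(x)
    return [(z[t] * cmath.exp(2j * math.pi * shift_bins * t / n)).real
            for t in range(n)]


def peak_bin(x):
    """Index of the largest-magnitude bin in the non-negative half spectrum."""
    n = len(x)
    mags = [abs(v) for v in dft(x)[: n // 2 + 1]]
    return max(range(len(mags)), key=mags.__getitem__)


# Toy "atom": a cosine concentrated in bin 5 (hypothetical example data).
n = 64
atom = [math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
shifted = frequency_shift(atom, 3)
print(peak_bin(atom), peak_bin(shifted))  # prints "5 8"
```

Working with the analytic signal rather than the raw real sequence matters here: multiplying a real cosine directly by a complex exponential would move its positive- and negative-frequency components in opposite directions relative to each other, whereas the analytic signal has only the positive-frequency component to move.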
Authors
HUO Yingxiang (霍颖翔)
TENG Shaohua (滕少华)
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
Source
Journal of Jiangxi Normal University (Natural Science Edition) (《江西师范大学学报(自然科学版)》)
CAS
Peking University Core Journals (北大核心)
2019, No. 4, pp. 394-401 (8 pages)
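Funding
National Natural Science Foundation of China (61772141, 61402118, 61673123, 61603100, 61702110)
Guangdong Provincial Science and Technology Program (2016B010108007)
Department of Education of Guangdong Province (粤教高函[2018]179号, 粤教高函[2018]1号, 粤教高函[2015]113号, 粤教高函[2014]97号)
Guangzhou Science and Technology Program (201802030011, 201802010026, 201802010042, 201604020145, 201604046017)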
Keywords
voice stream
lossy compression
online dictionary learning