Abstract
This paper studies the waveform interpolation (WI) speech coding model and its parameter quantization, and proposes a 1 kb/s waveform interpolation speech coding algorithm based on two-dimensional nonnegative matrix factorization (2DNMF-WI). Two-dimensional nonnegative matrix factorization (2D-NMF) is used to decompose the characteristic waveform (CW) of speech. The decomposition compresses the CW magnitude spectrum matrix along both rows and columns, so the resulting coding matrix has small dimensions and is easy to quantize. In addition, at very low bit rates there are not enough bits to describe the coding parameters, so high-quality synthesized speech is hard to obtain. To reduce the bit rate and improve speech quality, the algorithm uses two-frame joint coding, inter-frame backward-prediction three-stage vector quantization, the discrete cosine transform (DCT), and split matrix quantization. Informal subjective listening tests show that speech synthesized by the 1 kb/s 2DNMF-WI coder is slightly worse in quality than that of the 2 kb/s NMF-WI speech coding algorithm.
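The abstract does not spell out the factorization itself, so the following is only a minimal sketch of one common way to compress a nonnegative matrix along both rows and columns. It assumes a two-stage scheme: a column-direction basis `L` and a row-direction basis `R` are learned with ordinary NMF, and a small nonnegative coding matrix `C` is then fitted so that `V ≈ L C Rᵀ`. The function names `nmf` and `nmf_2d`, the ranks `p` and `q`, and the Frobenius-norm multiplicative updates are all illustrative assumptions, not the paper's formulation.

```python
import numpy as np


def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Plain NMF, V ~= W @ H, via multiplicative updates (Frobenius norm)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def nmf_2d(V, p, q, iters=200, eps=1e-9):
    """Assumed two-stage 2D-NMF sketch: V (m x n) ~= L (m x p) @ C (p x q) @ R.T (q x n)."""
    # Stage 1: basis for the column direction (reduces m rows to p).
    L, _ = nmf(V, p, iters, eps)
    # Stage 2: basis for the row direction (reduces n columns to q).
    R, _ = nmf(V.T, q, iters, eps)
    # Stage 3: fit the nonnegative coding matrix C with L and R held fixed.
    rng = np.random.default_rng(1)
    C = rng.random((p, q)) + eps
    LtVR = L.T @ V @ R
    LtL, RtR = L.T @ L, R.T @ R
    for _ in range(iters):
        C *= LtVR / (LtL @ C @ RtR + eps)
    return L, C, R


if __name__ == "__main__":
    # Synthetic stand-in for a CW magnitude spectrum matrix (not real speech data).
    V = np.random.default_rng(42).random((32, 20))
    L, C, R = nmf_2d(V, p=6, q=4)
    rel_err = np.linalg.norm(V - L @ C @ R.T) / np.linalg.norm(V)
    print("coding matrix shape:", C.shape, "relative error:", rel_err)
```

In this sketch only the `p x q` matrix `C` would need to be quantized per block, provided the bases `L` and `R` are trained offline and shared by encoder and decoder; whether the paper trains its bases this way is not stated in the abstract.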
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2010, No. 7, pp. 1574-1579 (6 pages)
Acta Electronica Sinica
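Funding
Science and Technology Development Program of the Beijing Municipal Education Commission (No. KM200710005001)
National Natural Science Foundation of China (No. 60372063)
Beijing Natural Science Foundation (No. 4042009)
Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (人才强教计划)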
Keywords
speech coding
waveform interpolation
characteristic waveform
two-dimensional nonnegative matrix factorization
two-frame joint