This paper presents a new speech codec that uses an artificial neural network (ANN) to perform nonlinear prediction. At the same bit rate, the codec synthesizes speech of better quality than conventional waveform or hybrid codecs. Its most important characteristic is its low coding delay, which helps improve speech-communication QoS when speech signals are transmitted over IP or ATM networks.
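Drawing on the strengths of the mixed excitation linear prediction (MELP) and code-excited linear prediction (CELP) algorithms, the authors propose a hybrid MELP/CELP speech coding model. The encoder applies MELP coding to strongly voiced frames and CELP coding to weakly voiced and unvoiced frames. The MELP encoder uses a phase alignment technique to extract the phase parameters of strongly voiced frames, which solves the problem of the synthesized speech being out of time synchronization with the original speech. The implemented 4 kbit/s hybrid MELP/CELP vocoder was evaluated with an objective MOS (mean opinion score) measure and a subjective DRT (diagnostic rhyme test) intelligibility test. The results show that its synthesized speech has high intelligibility and clarity.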
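The per-frame coder selection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how frames are classified, so `classify_frame`, the `voicing_strength` input, and both thresholds are hypothetical placeholders. Only the mapping from frame class to coder comes from the abstract.

```python
from enum import Enum


class FrameClass(Enum):
    STRONGLY_VOICED = "strongly voiced"
    WEAKLY_VOICED = "weakly voiced"
    UNVOICED = "unvoiced"


def classify_frame(voicing_strength: float,
                   strong_threshold: float = 0.7,
                   weak_threshold: float = 0.3) -> FrameClass:
    """Hypothetical classifier: the thresholds and the scalar voicing
    measure are assumptions made only for this illustration."""
    if voicing_strength >= strong_threshold:
        return FrameClass.STRONGLY_VOICED
    if voicing_strength >= weak_threshold:
        return FrameClass.WEAKLY_VOICED
    return FrameClass.UNVOICED


def select_coder(frame_class: FrameClass) -> str:
    """Rule stated in the abstract: MELP for strongly voiced frames,
    CELP for weakly voiced and unvoiced frames."""
    if frame_class is FrameClass.STRONGLY_VOICED:
        return "MELP"
    return "CELP"


if __name__ == "__main__":
    # Walk a few made-up voicing strengths through the dispatch.
    for strength in (0.9, 0.5, 0.1):
        frame_class = classify_frame(strength)
        print(strength, frame_class.value, select_coder(frame_class))
```

Keeping classification separate from coder selection means the selection rule stays fixed even if a different voicing detector is substituted.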
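To improve the encoding and reconstruction performance of deep models, this paper adds a cross-entropy-based reconstruction error constraint to conventional contrastive divergence (CD). A reconstructive deep auto-encoder (RDAE) is trained with the improved algorithm and then used to replace the vector quantization of the LSF parameters in the mixed excitation linear prediction (MELP) speech coder. Test results show that the improved algorithm gains reconstruction performance at the cost of some model likelihood. With the RDAE hidden layer set to 19 bits, the weighted LSF distance, reconstructed speech quality, and spectral distortion measured on both the training and test sets are all better than those of 25-bit vector quantization. In other words, the MELP coder improved with this method can lower the MELP coding rate from 2.4 kb/s to 2.1 kb/s, a 12.5% reduction, without degrading speech quality.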
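The bit-rate arithmetic behind these figures can be checked with a short sketch. The 25-bit and 19-bit LSF budgets and the 2.4 kb/s and 2.1 kb/s rates come from the abstract. The 22.5 ms frame length and the 54-bit baseline frame are assumptions based on the standard 2.4 kb/s MELP frame layout, which the abstract does not state.

```python
FRAME_MS = 22.5            # assumption: standard MELP frame duration
BASELINE_FRAME_BITS = 54   # assumption: bits per frame at 2.4 kb/s
VQ_LSF_BITS = 25           # from the abstract: LSF vector quantization
RDAE_LSF_BITS = 19         # from the abstract: RDAE hidden-layer code


def bit_rate_bps(frame_bits: int, frame_ms: float = FRAME_MS) -> float:
    """Bit rate in bits per second for a fixed-size frame."""
    return frame_bits * 1000 / frame_ms


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fractional rate reduction relative to the old rate."""
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # Replace the 25-bit LSF field with the 19-bit RDAE code.
    new_frame_bits = BASELINE_FRAME_BITS - VQ_LSF_BITS + RDAE_LSF_BITS
    print(bit_rate_bps(BASELINE_FRAME_BITS))   # baseline rate
    print(bit_rate_bps(new_frame_bits))        # rate with the RDAE code
    # Reduction computed from the rounded rates quoted in the abstract.
    print(relative_reduction(2400, 2100))
```

Under these assumed frame parameters, the new rate is about 2133 b/s, which rounds to the quoted 2.1 kb/s. The 12.5% figure follows from the rounded rates (300 / 2400).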