Channel Adaptation Based on Deep Neural Networks for Speaker Verification (基于深度神经网络的说话人信道自适应方法)


Abstract: To address channel mismatch between training and test speech in speaker verification, a channel adaptation method based on deep neural networks (DNNs) is proposed. First, several phonetically aware DNNs are trained on speech data recorded under different channel types. The acoustic features of each speaker utterance are then adapted on these DNNs to obtain a deep bottleneck feature (DBF) for each channel type. The DBFs are concatenated and reduced in dimension with PCA. Finally, the reduced DBFs are modeled with the identity vector (i-vector) technique, which the authors describe as the most effective current approach for speaker verification. This yields i-vectors for the target speaker model and the test utterance, which are used for the final verification scoring and decision. Experiments on the NIST SRE2010 core test evaluation data show that the proposed method effectively reduces the impact of channel distortion on speaker verification and substantially improves on the i-vector baseline system.
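The concatenation and PCA steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the number of channel-specific DNNs (3), the bottleneck dimension (40), the reduced dimension (60), and the function names are all hypothetical, and random arrays stand in for real DBFs. DNN training and i-vector extraction are not shown.

```python
import numpy as np

def concat_dbf(dbf_per_channel):
    """Concatenate per-channel DBF matrices (frames x dim) along the feature axis."""
    return np.concatenate(dbf_per_channel, axis=1)

def fit_pca(features, n_components):
    """Estimate the PCA mean and projection matrix from frames via SVD."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components].T

def apply_pca(features, mean, projection):
    """Center the frames and project them onto the retained components."""
    return (features - mean) @ projection

rng = np.random.default_rng(0)
# Placeholder DBFs (assumption): 3 channel-specific DNNs, 200 frames, 40 dims each.
dbfs = [rng.standard_normal((200, 40)) for _ in range(3)]

stacked = concat_dbf(dbfs)                 # shape (200, 120)
mean, proj = fit_pca(stacked, 60)          # keep 60 components (assumed value)
reduced = apply_pca(stacked, mean, proj)   # shape (200, 60)
```

In a real system the PCA transform would be estimated on training data and then applied unchanged to enrollment and test utterances, so that all i-vectors are extracted in the same feature space.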

Authors: Long Yanhua (龙艳花), Ni Jifeng (倪继锋), Ye Hong (叶宏)

Affiliation: Department of Electrical Information, Shanghai Normal University (上海师范大学电气信息系)

Source: Journal of Sichuan University (Engineering Science Edition) (《四川大学学报（工程科学版）》), indexed in EI, CAS, CSCD, and the PKU Core list; 2016, No. 2, pp. 151-155 (5 pages)

Funding: Shanghai Sailing Program for Young Science and Technology Talents (14YF1409300)

Keywords: channel adaptation; deep neural network; deep bottleneck feature; i-vector; speaker verification

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

