A Speech Separation Technique with Improved LSTM Training
Abstract: Using a long short-term memory (LSTM) network for speech separation makes good use of the temporal correlation of speech signals and improves the intelligibility of the separated speech. It also brings high computational complexity and long training times, and it does little to improve perceptual speech quality scores. To address this, a unit structure with fewer parameters is used to optimize the model and shorten training time. To further improve the quality and intelligibility of the target speech, the model's input features are optimized with a self-attention mechanism, which suppresses the influence of noise-dominated time-frequency units on the separation result. To better match the performance metrics used for speech separation, a loss function related to speech evaluation metrics is proposed and applied in the training criterion to improve system performance. Experiments show that the speech separation system optimized in these respects not only shortens training time effectively but also achieves an overall improvement in the performance metrics of the separated speech.
Authors: GUO Jiamin (郭佳敏); LI Hongyan (李鸿燕) (College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China)
Source: Electronic Design Engineering (《电子设计工程》), 2021, No. 11, pp. 140-145, 150 (7 pages)
Funding: Natural Science Foundation of Shanxi Province (201701D121058).
Keywords: deep learning; speech separation; long short-term memory network (LSTM); self-attention; loss function; speech evaluation metrics
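
The abstract says a unit structure with fewer parameters replaces the LSTM unit to shorten training, but it does not name that unit. As a rough illustration only, the sketch below assumes a GRU-style unit (a common fewer-parameter alternative, not confirmed by this record) and compares per-layer parameter counts using the textbook single-bias formulation. The feature and hidden sizes are hypothetical.

```python
def gate_block_params(input_size: int, hidden_size: int) -> int:
    """Parameters of one gate block: input weights, recurrent weights, one bias."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def lstm_params(input_size: int, hidden_size: int) -> int:
    """A standard LSTM layer has 4 blocks: input, forget, output gates and cell candidate."""
    return 4 * gate_block_params(input_size, hidden_size)


def gru_params(input_size: int, hidden_size: int) -> int:
    """A standard GRU layer has 3 blocks: update gate, reset gate and candidate state.

    Assumption: the GRU stands in for the unnamed fewer-parameter unit.
    """
    return 3 * gate_block_params(input_size, hidden_size)


if __name__ == "__main__":
    # Hypothetical sizes: 257 spectral bins per frame, 512 hidden units.
    d, h = 257, 512
    n_lstm = lstm_params(d, h)
    n_gru = gru_params(d, h)
    print(f"LSTM layer parameters: {n_lstm}")
    print(f"GRU layer parameters:  {n_gru}")
    print(f"GRU / LSTM ratio:      {n_gru / n_lstm:.2f}")
```

Under this formulation the GRU-style layer carries three quarters of the LSTM layer's parameters at any layer size, which is consistent in direction with the shorter training time the abstract reports. Library implementations may count biases differently, so exact totals can vary.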