Abstract
Speech interaction has attracted researchers' attention since the 20th century, because communication runs through every aspect of daily life. Early methods based on traditional machine learning can no longer meet the demands of diverse speech communication. To address the low separation quality and inaccurate results found in speech interaction technology, this paper separates speech with a fully convolutional separation network that includes a temporal convolution component: an encoder processes segments of the mixed speech, and a decoder reconstructs the speech waveforms to produce the separated outputs. Experimental results show that the model has a low computational cost and relatively short latency, making it a comparatively good approach to speech separation.
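The encoder, temporal convolution, and decoder stages named in the abstract can be illustrated with a minimal, untrained sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the linear encoder basis, the fixed averaging kernel, the dilation schedule, and the softmax used to form the masks. A real model would learn these weights from data.

```python
import math

def frame(signal, win, hop):
    """Split a waveform into overlapping segments of length `win`."""
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, hop)]

def encode(frames, basis):
    """Project each segment onto the encoder basis (one row per feature)."""
    return [[sum(b * x for b, x in zip(row, seg)) for row in basis]
            for seg in frames]

def dilated_conv1d(seq, kernel, dilation):
    """Causal dilated 1-D convolution over time, zero-padded on the left."""
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out

def estimate_masks(features, n_sources, kernel=(0.5, 0.5), dilations=(1, 2, 4)):
    """Hypothetical, untrained mask estimator: stacked dilated convolutions
    per feature channel, then a softmax over sources so masks sum to one."""
    n_frames, n_feat = len(features), len(features[0])
    scores = [[[0.0] * n_feat for _ in range(n_frames)] for _ in range(n_sources)]
    for n in range(n_feat):
        channel = [f[n] for f in features]
        for d in dilations:
            channel = dilated_conv1d(channel, kernel, d)
        for c in range(n_sources):
            for t, v in enumerate(channel):
                scores[c][t][n] = (c + 1) * v  # placeholder per-source weight
    masks = [[[0.0] * n_feat for _ in range(n_frames)] for _ in range(n_sources)]
    for t in range(n_frames):
        for n in range(n_feat):
            top = max(scores[c][t][n] for c in range(n_sources))
            exps = [math.exp(scores[c][t][n] - top) for c in range(n_sources)]
            total = sum(exps)
            for c in range(n_sources):
                masks[c][t][n] = exps[c] / total
    return masks

def decode(features, basis, win, hop, length):
    """Map features back to segments, then overlap-add with averaging."""
    out, count = [0.0] * length, [0] * length
    for i, feat in enumerate(features):
        for j in range(win):
            out[i * hop + j] += sum(f * row[j] for f, row in zip(feat, basis))
            count[i * hop + j] += 1
    return [o / c if c else 0.0 for o, c in zip(out, count)]

def separate(mixture, basis, win, hop, n_sources=2):
    """Encode the mixture, mask the features per source, decode each source."""
    feats = encode(frame(mixture, win, hop), basis)
    masks = estimate_masks(feats, n_sources)
    estimates = []
    for c in range(n_sources):
        masked = [[m * f for m, f in zip(mrow, frow)]
                  for mrow, frow in zip(masks[c], feats)]
        estimates.append(decode(masked, basis, win, hop, len(mixture)))
    return estimates
```

Because the masks sum to one at every time-feature position and all other steps are linear, calling `separate` with an identity basis returns estimates that add back up to the input mixture. This checks the data flow only; it says nothing about separation quality, which depends on trained weights.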
Author
CHEN Yao (陈瑶), University of Xijing, Xi'an 710123, China
Source
Audio Engineering (《电声技术》), 2022, No. 4, pp. 47-49 (3 pages)
Keywords
convolutional neural network
time-domain network
speech separation