Abstract
To improve the accuracy of human-robot interaction in dialogue robots, this paper proposes a speech enhancement method based on a cooperative recursive network to optimize the speech analysis module. The method first constructs an autoregressive (AR) parameter estimation model of the speech signal based on the generalized least absolute deviation method and solves the model with a deep recursive Q-network. Then, using the estimated parameters, a Kalman filter recursive network restores the speech signal data step by step. In speech enhancement tests, the proposed method restores the speech signal better than common speech enhancement methods such as improved spectral subtraction, YW-estimation adaptive Kalman filtering, and MG adaptive Kalman filtering. The gain is most pronounced when denoising short-time speech, where the method reduces speech distortion and greatly improves the signal-to-noise ratio. In human-computer interaction tests, the interaction system optimized with the proposed speech enhancement method reaches a dialogue recognition accuracy of 93.33%, which is 16.66% higher than that of the unoptimized system, and it better meets the human-computer interaction requirements of dialogue robots.
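The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea it describes: estimate AR parameters, then run a Kalman filter built from them. It swaps in noise-compensated Yule-Walker estimation for the paper's generalized least absolute deviation model and deep recursive Q-network solver, and it assumes white observation noise of known variance. The function names (`yule_walker_ar`, `kalman_denoise`, `snr_db`, `run_demo`) and the synthetic AR(2) test signal are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def yule_walker_ar(y, order, noise_var=0.0):
    """Estimate AR coefficients and driving-noise variance from noisy samples.

    Assumption: the observation noise is white with known variance, so only the
    lag-0 autocorrelation needs to be corrected.
    """
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = len(y)
    # Biased sample autocorrelation for lags 0..order.
    r = np.array([np.dot(y[: n - k], y[k:]) / n for k in range(order + 1)])
    r[0] -= noise_var  # remove the white-noise contribution at lag 0
    # Solve the Toeplitz system R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    sigma2 = r[0] - np.dot(a, r[1:])
    return a, sigma2


def kalman_denoise(y, a, sigma2, noise_var):
    """Kalman filter for x[t] = sum_k a[k] * x[t-1-k] + e[t], y[t] = x[t] + v[t]."""
    p = len(a)
    F = np.zeros((p, p))
    F[0, :] = a                      # first row holds the AR coefficients
    F[1:, :-1] = np.eye(p - 1)       # remaining rows shift the state down
    H = np.zeros(p)
    H[0] = 1.0                       # only the newest sample is observed
    Q = np.zeros((p, p))
    Q[0, 0] = sigma2
    s = np.zeros(p)
    P = np.eye(p) * (sigma2 + noise_var)
    out = np.empty(len(y))
    for t, yt in enumerate(y):
        # Prediction step.
        s = F @ s
        P = F @ P @ F.T + Q
        # Update step with a scalar observation.
        innovation = yt - H @ s
        S = H @ P @ H + noise_var
        K = (P @ H) / S
        s = s + K * innovation
        P = P - np.outer(K, H @ P)
        out[t] = s[0]
    return out


def snr_db(clean, estimate):
    """Signal-to-noise ratio of an estimate relative to the clean signal, in dB."""
    err = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(err ** 2))


def run_demo(seed=0, n=20000, noise_var=4.0):
    """Denoise a synthetic AR(2) signal and return (input SNR, output SNR)."""
    rng = np.random.default_rng(seed)
    true_a = np.array([1.5, -0.7])   # stable AR(2) chosen for illustration
    e = rng.normal(size=n)
    clean = np.zeros(n)
    for t in range(2, n):
        clean[t] = true_a[0] * clean[t - 1] + true_a[1] * clean[t - 2] + e[t]
    noisy = clean + rng.normal(scale=np.sqrt(noise_var), size=n)
    a_hat, sigma2_hat = yule_walker_ar(noisy, order=2, noise_var=noise_var)
    enhanced = kalman_denoise(noisy, a_hat, sigma2_hat, noise_var)
    return snr_db(clean, noisy), snr_db(clean, enhanced)


if __name__ == "__main__":
    snr_in, snr_out = run_demo()
    print(f"input SNR: {snr_in:.2f} dB, output SNR: {snr_out:.2f} dB")
```

Real speech is non-stationary, so a practical system would re-estimate the AR parameters frame by frame and would also need to estimate the noise variance; the sketch holds both fixed to keep the filter recursion easy to follow.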
Author
REN Fang (任芳), Xi’an Fanyi University, Xi’an 710105, China
Source
Automation & Instrumentation (《自动化与仪器仪表》), 2024, No. 3, pp. 184-188 (5 pages)
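Funding
Shaanxi Province "14th Five-Year Plan" for Education Science, 2022 project "Research on the Application of Digital Teaching and Assessment in Audio-Visual-Oral Courses for English Majors" (SGH22Y1777).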
Keywords
recursive network
Q-learning
Kalman filtering
speech enhancement
voice interaction