Abstract
Acoustic keyword spotting (AKS), an important branch of speech recognition, began to attract attention in the 1990s and is now applied in many fields, including in-car voice command recognition, robot interaction, and the screening of specific speech. This paper presents the overall model of AKS and its performance evaluation metrics, reviews the current state of acoustic model construction for AKS systems, summarizes these construction techniques in detail, and focuses on the application of deep learning to acoustic model construction. Finally, the prospects of AKS are discussed. The paper argues that the hybrid deep neural network and hidden Markov model (DNN-HMM), as the most mature modeling technique in continuous speech recognition, will see wider use in keyword spotting; that recurrent neural networks (RNN) may become a more effective modeling technique owing to their sequence training ability; and that large-scale computation, cloud platforms, and portable and wearable devices will become the mainstream directions of AKS development.
Source
《燕山大学学报》
CAS
Peking University Core Journals (北大核心)
2017, Issue 6, pp. 471-481 (11 pages)
Journal of Yanshan University
Funding
National Natural Science Foundation of China (Grant No. 61271248)
Keywords
acoustic keyword spotting
dynamic time warping
hidden Markov model
deep neural network
support vector machine
recurrent neural network