Abstract
As deep learning research has advanced, speech recognition has moved from traditional models to deep learning models. This paper implements offline speech recognition on a mobile device and aims to improve its recognition accuracy. Using a deep learning approach, a model trained on a PC is ported to a Raspberry Pi 3B+ to perform speech recognition. The system consists of two parts: an acoustic model and a language model. Comparative tests against other mainstream speech recognition models lead to the conclusion that both the DFCNN acoustic model and the encoder part of the Transformer language model are suitable for porting to embedded devices. At a cost far below that of existing speech recognition products on the market, the system's recognition accuracy and speed are very close to theirs.
Authors
谭磊
余欣洋
罗伟洋
曾维
代云强
Tan Lei; Yu Xinyang; Luo Weiyang; Zeng Wei; Dai Yunqiang (College of Information Science and Technology, Chengdu University of Technology, Chengdu 610059, China)
Source
《单片机与嵌入式系统应用》
2020, No. 9, pp. 28-31, 35 (5 pages)
Microcontrollers & Embedded Systems
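Funding
Active Monitor Stand Based on Image Recognition (S201910616036)
A Person-Bag Linkage Management System for Public Transportation (S201910616037)
WSN-Based Building Disaster Emergency Evacuation System (S201910616133)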
Keywords
deep learning
embedded system
speech recognition
acoustic model
language model