Abstract
Speech recognition is the core of speech technology in medical settings, where the main difficulties today are noise and heavy accents. This paper applies microphone array signal processing for directional noise reduction, and optimizes heavily accented speech recognition through variant pronunciation unit detection, prediction, and transfer learning model training. Experiments show that these two techniques improve character accuracy by an absolute 8.9% and 2.8%, respectively. Combined with language customization, speech recognition reaches 97% character accuracy in multiple medical scenarios.
Authors
李轶杰
关海欣
刘升平
LI Yi-jie; GUAN Hai-xin; LIU Sheng-ping (Unisound AI Technology Co., Ltd., Beijing 100096, P.R.C.)
Source
《中国数字医学》
2021, Issue 8, pp. 7-11 (5 pages)
China Digital Medicine
Keywords
speech recognition
intelligent speech
medical application scenario
technology application
electronic medical record