
Discussion on Challenges and Key Techniques of Intelligent Speech Technologies in Medicine

Original title: 医疗场景下智能语音技术难点及解决方法探讨
Cited by: 5
Abstract: Speech recognition is the core of applying speech technology in medical settings, and its main difficulties at present are noise and heavy accents. This paper uses microphone array signal processing for directional noise reduction, and optimizes recognition of heavily accented speech through variant pronunciation unit detection and prediction together with transfer-learning model training. In the authors' experiments, these two techniques improve character accuracy by 8.9% and 2.8% absolute, respectively. Combined with language customization techniques, speech recognition in medical settings reaches 97% character accuracy in multiple scenarios.
Authors: LI Yi-jie (李轶杰), GUAN Hai-xin (关海欣), LIU Sheng-ping (刘升平) (Unisound AI Technology Co., Ltd., Beijing 100096, P.R.C.)
Source: China Digital Medicine (《中国数字医学》), 2021, No. 8, pp. 7-11 (5 pages)
Keywords: speech recognition; intelligent speech; medical application scenario; technology application; electronic medical record
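The abstract reports its gains as absolute improvements in character accuracy, but this record does not define the metric. A common convention, assumed here and not confirmed by the paper, is one minus the character-level edit distance divided by the reference length. The sketch below illustrates that assumed definition; the function names and the sample sentences are hypothetical and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - (edit distance / number of reference characters)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Hypothetical transcripts, for illustration only.
reference = "患者主诉头痛三天"
baseline = "患者主速头疼三天"   # two substituted characters
improved = "患者主诉头痛三天"   # exact match

acc_base = char_accuracy(reference, baseline)
acc_new = char_accuracy(reference, improved)
absolute_gain = acc_new - acc_base  # difference in accuracy points, not a ratio
print(f"{acc_base:.2%} -> {acc_new:.2%} (absolute gain {absolute_gain:.2%})")
```

Under this reading, an "absolute 8.9%" improvement means the accuracy figure itself rises by 8.9 percentage points, rather than by 8.9% of its previous value.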
