
Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control
Abstract A long short-term memory (LSTM) recurrent neural network fused with i-vector features is proposed for voice control of a laparoscopic supporter. The model recognizes short, isolated-word commands in the speech of a specific doctor from a small training set. An LSTM recurrent neural network serves as the base model, with Mel-frequency cepstral coefficients (MFCC) as the input features. The i-vector is supplied as deep-layer input: it is concatenated with the deep features produced after the LSTM layer, which fuses the two sets of parameters. This lets the model accurately recognize voice commands from the designated chief surgeon and reject voice commands from anyone else, providing a safe and intelligent speech recognition scheme for laparoscopic operations. Experiments on a self-built speech corpus separately evaluate recognition performance on speech from the training library and rejection performance on speech from outside it. Compared with dynamic time warping (DTW) and the Gaussian mixture model-hidden Markov model (GMM-HMM), the proposed model reaches a 99.6% correct recognition rate for speaker-specific voice commands in the training library while keeping the false acceptance rate at 0%, and its average false acceptance rate for speech outside the training library is 2.5%. These results meet the accuracy and safety requirements of laparoscopic supporter control.
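The fusion step described in the abstract can be illustrated with a minimal NumPy sketch of a forward pass. It is not the authors' implementation: the layer sizes, the random untrained weights, the helper names (`init_params`, `lstm_last_hidden`, `classify`), and the softmax-confidence threshold used for rejection are all assumptions made for illustration, since the abstract does not specify them. What it does show is the stated idea: MFCC frames pass through an LSTM, and the resulting deep feature vector is concatenated with the utterance-level i-vector before classification.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def init_params(n_mfcc, hidden, ivec_dim, n_commands, seed=0):
    """Random, untrained weights; every dimension here is hypothetical."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden, n_mfcc)),   # input -> gates
        "U": rng.normal(0.0, 0.1, (4 * hidden, hidden)),   # hidden -> gates
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0.0, 0.1, (n_commands, hidden + ivec_dim)),
        "c": np.zeros(n_commands),
    }

def lstm_last_hidden(mfcc, W, U, b):
    """Run one LSTM layer over MFCC frames of shape (T, n_mfcc)
    and return the final hidden state of shape (hidden,)."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in mfcc:
        z = W @ x_t + U @ h + b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def classify(mfcc, ivector, params, threshold=0.9):
    """Fuse the LSTM feature with the i-vector by concatenation, then classify.
    Returning -1 (reject) below `threshold` is an assumed rejection rule,
    not one taken from the paper."""
    h = lstm_last_hidden(mfcc, params["W"], params["U"], params["b"])
    fused = np.concatenate([h, ivector])            # parameter fusion
    probs = softmax(params["V"] @ fused + params["c"])
    best = int(np.argmax(probs))
    label = best if probs[best] >= threshold else -1
    return label, probs

# Hypothetical sizes: 13 MFCCs, 50 frames, 100-dim i-vector, 6 commands.
params = init_params(n_mfcc=13, hidden=32, ivec_dim=100, n_commands=6)
rng = np.random.default_rng(1)
label, probs = classify(rng.normal(size=(50, 13)), rng.normal(size=100), params)
```

Concatenating after the LSTM layer, rather than appending the i-vector to every MFCC frame, keeps the speaker identity cue as a single utterance-level vector that the classifier can weigh alongside the command content. In a real system the weights would be trained, and the i-vector would come from a separate extractor.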
Authors Ren Kailong; Wang Yi; Chen Xiaodong; Cai Huaiyu (School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China)
Source Laser & Optoelectronics Progress (《激光与光电子学进展》), CSCD, Peking University Core Journal, 2020, No. 18, pp. 374-382 (9 pages)
Keywords medical optics; laparoscope; i-vector; long short-term memory; speaker-dependent speech recognition
