Abstract
Courses on indoor service robots combine theory with engineering practice. To address this, a target-speaker speech recognition experiment within a human-machine interaction system is used as a typical case. Noisy speech is enhanced and denoised by spectral subtraction with an over-subtraction factor and a spectral floor. Speaker recognition is performed on Mel-Frequency Cepstral Coefficient (MFCC) features using a Gaussian Mixture Model-Universal Background Model (GMM-UBM), and an end-to-end deep neural network module for Mandarin speech recognition is deployed. Together these complete a comprehensive, innovative experimental teaching task and an extended practical training exercise. Experiments show that a service robot with these modules deployed accurately completes the whole pipeline in 0.896 s on average, demonstrating the feasibility and effectiveness of the design.
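The enhancement step named in the abstract, spectral subtraction with an over-subtraction factor and a spectral floor, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the frame length, the 50% overlapped periodic Hann window, and the values `alpha=4.0` and `beta=0.01` are all assumptions chosen for illustration, and the noise spectrum is assumed to be estimated from a separate noise-only recording.

```python
import numpy as np


def _frames(x, frame_len, hop, window):
    """Slice x into overlapping windowed frames, shape (n_frames, frame_len)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx] * window


def spectral_subtract(noisy, noise_only, frame_len=256, hop=128,
                      alpha=4.0, beta=0.01):
    """Power spectral subtraction (illustrative sketch, assumed parameters).

    alpha: over-subtraction factor applied to the noise power estimate.
    beta:  spectral floor, as a fraction of the noise power estimate.
    """
    # Periodic Hann window: at 50% overlap the shifted windows sum to 1,
    # so plain overlap-add reconstructs an unmodified signal.
    window = np.hanning(frame_len + 1)[:-1]

    # Average noise power spectrum from the noise-only segment.
    noise_spec = np.fft.rfft(_frames(noise_only, frame_len, hop, window), axis=1)
    noise_power = np.mean(np.abs(noise_spec) ** 2, axis=0)

    spec = np.fft.rfft(_frames(noisy, frame_len, hop, window), axis=1)
    power = np.abs(spec) ** 2

    # Subtract the scaled noise estimate, then clamp to the spectral floor.
    clean_power = np.maximum(power - alpha * noise_power, beta * noise_power)

    # Keep the noisy phase and resynthesise by overlap-add.
    clean_spec = np.sqrt(clean_power) * np.exp(1j * np.angle(spec))
    out_frames = np.fft.irfft(clean_spec, n=frame_len, axis=1)
    out = np.zeros(hop * (len(out_frames) - 1) + frame_len)
    for i, frame in enumerate(out_frames):
        out[i * hop:i * hop + frame_len] += frame
    return out
```

A larger `alpha` removes more residual noise at the cost of more speech distortion, while a non-zero `beta` keeps a low noise floor that masks isolated spectral peaks ("musical noise"). Suitable values depend on the input signal-to-noise ratio and would need tuning on real recordings.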
Authors
LIANG Yiwen (梁伊雯)
HAN Ziqi (韩子奇)
ZHANG Zhiming (张志明)
SUN Yijia (孙艺珈)
School of Software Engineering, Tongji University, Shanghai 200092, China; College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China
Source
Research and Exploration in Laboratory (《实验室研究与探索》)
CAS
Peking University Core Journal (北大核心)
2023, Issue 1, pp. 30-35 (6 pages)
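Funding
Ministry of Education Industry-University Cooperative Education Program (202101303027, 201902016059)
Shanghai Municipal College Students' Innovation and Entrepreneurship Training Program (S202110247020)
Tongji University "Double First-Class" Guided Special Competition Project (4250145305/004)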
Keywords
service robot
human-machine interaction
speech recognition
speaker recognition
speech enhancement