2 articles found
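1. Development and Application of Speech Recognition Programs (语音识别程序的开发与应用). Cited by: 2
Authors: 申建国 (Shen Jianguo), 王暖臣 (Wang Nuanchen). 《计算机应用研究》 (Application Research of Computers), CSCD, 2000, No. 12, pp. 77-78, 99 (3 pages)
Abstract: Introduces the general background of speech recognition, describes methods for developing speech recognition programs and the issues to watch for during development, and gives application examples of speech recognition.
Keywords: speech recognition program; development; command and control; speech signal processing; computer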
2. Past review, current progress, and challenges ahead on the cocktail party problem. Cited by: 3
Authors: Yan-min QIAN, Chao WENG, Xuan-kai CHANG, Shuai WANG, Dong YU. 《Frontiers of Information Technology & Electronic Engineering》, SCIE, EI, CSCD, 2018, No. 1, pp. 40-63 (24 pages)
Abstract: The cocktail party problem, i.e., tracing and recognizing the speech of a specific speaker when multiple speakers talk simultaneously, is one of the critical problems yet to be solved to enable the wide application of automatic speech recognition (ASR) systems. In this overview paper, we review the techniques proposed in the last two decades in attacking this problem. We focus our discussions on the speech separation problem given its central role in the cocktail party environment, and describe the conventional single-channel techniques such as computational auditory scene analysis (CASA), non-negative matrix factorization (NMF) and generative models, the conventional multi-channel techniques such as beamforming and multi-channel blind source separation, and the newly developed deep learning-based techniques, such as deep clustering (DPCL), the deep attractor network (DANet), and permutation invariant training (PIT). We also present techniques developed to improve ASR accuracy and speaker identification in the cocktail party environment. We argue that effectively exploiting information in the microphone array, the acoustic training set, and the language itself, using more powerful models and better optimization objectives and techniques, will be the approach to solving the cocktail party problem.
Keywords: Cocktail party problem; Computational auditory scene analysis; Non-negative matrix factorization; Permutation invariant training; Multi-talker speech processing