Abstract
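Objective Unsupervised keyword detection for low-resource languages has attracted wide research interest in recent years. Because low-resource languages lack sufficient annotated data and the associated expert knowledge, traditional keyword detection techniques built on large-vocabulary speech recognition systems cannot be used. Researchers have therefore sought an unsupervised technique for spoken keyword detection in low-resource languages. Method We first describe the problems and challenges the technique currently faces, then introduce the mainstream algorithmic framework based on dynamic time warping, and review the main research results of recent years from several important aspects, including feature representation, template matching methods, and efficiency improvement. Finally, we introduce the system evaluation standards commonly used for this task and the performance currently attainable, and discuss possible future research directions. Result Research on this task has produced many results but remains at the laboratory stage. Multi-system fusion strategies make systems large, and there is still no good indexing method, which leads to excessively long detection times. Much research remains to be done on keyword detection for low-resource speech. Conclusion By providing a comprehensive survey of current unsupervised keyword detection techniques for low-resource languages, we hope to facilitate the work of researchers.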
Objective Query-by-example spoken term detection for low-resource languages has recently drawn considerable research interest. For low-resource languages that lack sufficient annotated data and related expert knowledge, spoken term detection techniques based on traditional large-vocabulary speech recognition cannot be used directly. Researchers have recently attempted to find an unsupervised technique to perform this task for low-resource languages. Method In this study, we first present the challenges confronting this task. We then introduce the algorithm framework based on dynamic time warping (DTW) that is commonly used in this task. Finally, we present recent research devoted to feature representation, template matching, speed-up, and other related topics. Result Although research on this technique for low-resource languages has made considerable progress, there are no real-life applications yet. Unified feature representation and indexing methods are still needed to achieve both good effectiveness and efficiency. Conclusion We present the commonly used performance evaluation standards. The conclusion of our investigation is presented, and possible future research directions are discussed.
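As a rough illustration of the DTW-based template matching named in the abstract, the following is a minimal sketch of subsequence DTW, in which a spoken query may start and end anywhere inside a longer utterance. It is a generic textbook formulation and not necessarily the variant used by any system covered in the survey. The function names `euclidean` and `subsequence_dtw`, the Euclidean frame distance, and the normalization by query length are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature frames (sequences of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(query, utterance, dist=euclidean):
    """Align `query` against the best-matching region of `utterance`.

    Both arguments are lists of feature frames. Returns a tuple
    (cost, end), where `cost` is the accumulated distance divided by
    the query length and `end` is the utterance frame index at which
    the best alignment ends. Hypothetical helper for illustration only.
    """
    n, m = len(query), len(utterance)
    if n == 0 or m == 0:
        raise ValueError("query and utterance must be non-empty")

    # First row: the match may start at any utterance frame at no extra cost.
    prev = [dist(query[0], utterance[j]) for j in range(m)]

    for i in range(1, n):
        # First column: only a vertical step from the previous row is possible.
        curr = [prev[0] + dist(query[i], utterance[0])]
        for j in range(1, m):
            # Standard step pattern: vertical, horizontal, or diagonal.
            best_prev = min(prev[j], curr[j - 1], prev[j - 1])
            curr.append(best_prev + dist(query[i], utterance[j]))
        prev = curr

    # The match may end at any utterance frame: take the cheapest end point.
    end = min(range(m), key=prev.__getitem__)
    return prev[end] / n, end


if __name__ == "__main__":
    query = [[0.0], [1.0], [2.0]]
    utterance = [[5.0], [0.0], [1.0], [2.0], [5.0]]
    cost, end = subsequence_dtw(query, utterance)
    print(cost, end)
```

A detection system built on this sketch would compare `cost` against a threshold to decide whether the query occurs in the utterance. The quadratic cost of filling the table for every utterance is one reason the abstract stresses speed-up and indexing.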
Source
《中国图象图形学报》
CSCD
Peking University Core Journals (北大核心)
2015, No. 2, pp. 211-218 (8 pages)
Journal of Image and Graphics
Funding
National Natural Science Foundation of China (61175018)
Fok Ying Tung Basic Research Fund for Young Teachers (131059)