Abstract
In recent years, keyword spotting (KWS) has made significant progress for spoken and telephone speech, but little has been published on keyword spotting in the audio embedded in streaming media. To address this gap, a keyword spotting system scheme for streaming media, based on audio document retrieval, is proposed. The system uses the Microsoft Windows Media Format Software Development Kit (WMFSDK) to extract the decoded audio data from the streaming media. To distinguish out-of-vocabulary (OOV) words from keywords, an on-line garbage (OLG) model is used to reject OOV words and to obtain multiple keyword candidates. In the keyword verification stage, a support vector machine (SVM) classifier is designed whose input feature vector consists of the MAP-based word confidence measures and the N-best features obtained during decoding. Experiments comparing the SVM method with the traditional Fisher method show that the SVM method performs better overall.
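The abstract does not say how the on-line garbage score is computed. As a rough illustration only, the sketch below assumes one common formulation: the garbage score of a frame is the mean of the N best local log-likelihoods, and a keyword candidate is kept when its average score is at least as high as the average garbage score. The function names, the `n_best` and `threshold` parameters, and the toy scores are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def garbage_score(frame_scores, n_best=3):
    """Local garbage score for one frame (assumed formulation):
    the mean of the n_best highest acoustic log-likelihoods."""
    if n_best < 1 or not frame_scores:
        raise ValueError("need n_best >= 1 and at least one score per frame")
    top = sorted(frame_scores, reverse=True)[:n_best]
    return mean(top)


def score_candidate(keyword_scores, unit_scores_per_frame, n_best=3, threshold=0.0):
    """Return (margin, accepted) for one keyword candidate.

    keyword_scores: per-frame log-likelihoods along the keyword path.
    unit_scores_per_frame: for each of the same frames, the log-likelihoods
        of all acoustic units, used to build the garbage score.
    threshold: hypothetical acceptance threshold on the margin.
    """
    if not keyword_scores or len(keyword_scores) != len(unit_scores_per_frame):
        raise ValueError("keyword and unit scores must cover the same frames")
    garbage = [garbage_score(frame, n_best) for frame in unit_scores_per_frame]
    margin = mean(keyword_scores) - mean(garbage)
    return margin, margin >= threshold


if __name__ == "__main__":
    # Toy log-likelihoods for two frames and four acoustic units (made up).
    frames = [[-1.0, -2.0, -3.0, -9.0], [-1.0, -2.0, -3.0, -9.0]]
    print(score_candidate([-1.0, -1.0], frames))  # candidate beats the garbage score
    print(score_candidate([-9.0, -9.0], frames))  # candidate falls below it
```

Candidates that pass such a test would then go to the verification stage, where the SVM classifier described above makes the final accept or reject decision from the confidence and N-best features.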
Source
《北京机械工业学院学报》
2006, No. 4, pp. 47-50 (4 pages in total)
Journal of Beijing Institute of Machinery