Abstract
This paper studies feature subset selection for speech recognition and speaker recognition using dynamic time warping (DTW) and graph-theoretic methods, and proposes a DTW-based directed acyclic graph method (DTWDAG). The method generalizes Euclidean-distance-based similarity matrix clustering to DTW-based similarity matrix clustering, and adapts graph-theoretic clustering into a cost function for selecting speech and speaker features. Combined with the (l-r) optimization algorithm, this cost function is applied to feature selection for speaker-dependent isolated-digit speech recognition and text-dependent speaker identification. Experimental results show that DTWDAG reflects the importance of feature subsets for speech recognition and speaker recognition well.
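The pipeline described in the abstract (a DTW distance, a cost function over feature subsets, and an (l-r) search) can be illustrated with a minimal Python sketch. The abstract does not give the DTWDAG cost function, so `separability` below is a stand-in assumption (mean between-class DTW distance minus mean within-class DTW distance), not the paper's graph-based criterion. All function names and the toy data are hypothetical.

```python
import math

def dtw_distance(a, b, dims):
    """DTW distance between two sequences of feature vectors,
    using only the feature indices listed in dims."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance restricted to the chosen features.
            local = math.sqrt(sum((a[i - 1][d] - b[j - 1][d]) ** 2 for d in dims))
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def separability(samples, labels, dims):
    """Assumed criterion (NOT the paper's cost function):
    mean between-class DTW distance minus mean within-class DTW distance."""
    within, between = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            d = dtw_distance(samples[i], samples[j], dims)
            (within if labels[i] == labels[j] else between).append(d)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(between) - mean(within)

def plus_l_minus_r(samples, labels, n_features, target, l=2, r=1):
    """Plus-l take-away-r search: greedily add l features, then greedily
    drop r features, repeating until `target` features are selected."""
    assert l > r >= 0 and 0 < target <= n_features
    score = lambda dims: separability(samples, labels, dims)
    selected = []
    while True:
        for _ in range(l):
            rest = [f for f in range(n_features) if f not in selected]
            selected.append(max(rest, key=lambda f: score(selected + [f])))
            if len(selected) == target:
                return sorted(selected)
        for _ in range(r):
            # Drop the feature whose removal leaves the best-scoring subset.
            drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
            selected.remove(drop)

# Toy data: feature 0 separates the two classes, feature 1 does not.
samples = [
    [[0.0, 1.0], [0.1, 3.0], [0.0, 2.0]],
    [[0.1, 3.0], [0.0, 1.0]],
    [[5.0, 2.0], [5.1, 1.0], [5.0, 3.0]],
    [[5.1, 1.0], [5.0, 3.0]],
]
labels = [0, 0, 1, 1]
best = plus_l_minus_r(samples, labels, n_features=2, target=1)
```

Because DTW aligns sequences of different lengths, the criterion can compare utterances directly without resampling them to a fixed number of frames, which is presumably why a DTW distance replaces the Euclidean one here.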
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journal
2005, No. 1, pp. 50-54 (5 pages)
Pattern Recognition and Artificial Intelligence
Keywords
Feature Selection
Similarity Matrix
Dynamic Time Warping
(l-r) Optimization Algorithm