Abstract
A speech signal is usually converted into a sequence of feature vectors, and each speaker's vectors have a characteristic distribution in feature space. Starting from this mapping, a split-merge clustering algorithm built on the binary-splitting algorithm is proposed and applied to text-independent speaker recognition. The algorithm yields a stable result with an adaptively determined number of centers, and it converges rapidly. On the basis of clustering statistics, a preliminary open-set text-independent speaker recognition system is established. Its reference templates are built from the distribution centers of each speaker's feature vectors, and during testing the cluster centers of the speech segment, rather than its individual feature vectors, are used for pattern matching. Consequently, the larger the recognition system, the greater the computational saving over the conventional VQ method. Experiments on a small open-set text-independent speaker identification system examine how the weighting of feature vectors, the duration of speech segments, and the choice of the factor α affect system performance.
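The abstract gives no algorithmic detail, so the following is only a minimal sketch of what split-merge clustering and center-based matching could look like. Every name and parameter here (`split_merge_cluster`, `merge_dist`, `eps`, `max_rounds`, `reject_threshold`) is a hypothetical chosen for illustration, the merge rule (fuse the closest pair of centers below a distance threshold) and the rejection rule for the open-set case are assumptions, and the feature weighting and the factor α studied in the paper are not modeled.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def assign(points, centers):
    """Group points by nearest center and drop groups that end up empty."""
    groups = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
        groups[j].append(p)
    return [g for g in groups if g]

def merge_close(centers, sizes, merge_dist):
    """Assumed merge rule: repeatedly fuse the closest pair of centers
    while that pair is nearer than merge_dist (size-weighted average)."""
    centers, sizes = [list(c) for c in centers], list(sizes)
    while len(centers) > 1:
        pairs = ((a, b) for a in range(len(centers))
                 for b in range(a + 1, len(centers)))
        i, j = min(pairs, key=lambda ab: dist2(centers[ab[0]], centers[ab[1]]))
        if math.sqrt(dist2(centers[i], centers[j])) >= merge_dist:
            break
        n = sizes[i] + sizes[j]
        centers[i] = [(sizes[i] * x + sizes[j] * y) / n
                      for x, y in zip(centers[i], centers[j])]
        sizes[i] = n
        del centers[j], sizes[j]
    return centers

def split_merge_cluster(points, merge_dist=1.0, eps=0.01, max_rounds=4, iters=10):
    """Binary splitting followed by merging; stops when a round of
    split + merge no longer increases the number of centers."""
    centers = [mean(points)]
    for _ in range(max_rounds):
        # Binary split: perturb each center in two opposite directions.
        trial = [[x + s * eps for x in c] for c in centers for s in (1, -1)]
        for _ in range(iters):  # nearest-center refinement
            groups = assign(points, trial)
            trial = [mean(g) for g in groups]
        merged = merge_close(trial, [len(g) for g in groups], merge_dist)
        if len(merged) <= len(centers):
            break
        centers = merged
    return centers

def match_score(test_centers, template):
    """Mean distance from each test center to its nearest template center."""
    return sum(min(math.sqrt(dist2(t, c)) for c in template)
               for t in test_centers) / len(test_centers)

def identify(test_frames, templates, reject_threshold):
    """Match cluster centers of the test segment, not every frame.
    Returns the best speaker, or None if rejected (assumed open-set rule)."""
    test_centers = split_merge_cluster(test_frames)
    scores = {name: match_score(test_centers, tpl)
              for name, tpl in templates.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= reject_threshold else None
```

Under this sketch, scoring T test frames directly against N speakers with M template centers each takes about N·M·T distance evaluations, whereas scoring K test centers takes N·M·K plus a one-time clustering cost that does not depend on N. That is one way to read the claim that the saving grows with system size.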
Source
《电路与系统学报》
CSCD
2001, Issue 3, pp. 77-80 (4 pages)
Journal of Circuits and Systems
Keywords
Speaker recognition
Clustering statistics
Speech recognition
Text-independent
Clustering
Cepstrum