Abstract
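To address the distortion that arises when vector quantization (VQ) is used for speaker recognition, and drawing on the pronunciation characteristics of Chinese speech, this paper proposes combining VQ with clustering of speech features: before the VQ codebook is trained, the feature vectors are clustered and filtered. Experimental results show that, for test utterances 4 s long and a recognition rate held at about 95%, ordinary VQ requires a codebook of 64 codewords, whereas the proposed method requires only 8, an eightfold reduction. The results indicate that the method not only alleviates, to some extent, the distortion caused by insufficient training samples, but also achieves good recognition results with fewer codewords, thereby improving recognition efficiency.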
In this paper, to address the problem of distortion in speaker recognition with vector quantization (VQ), we propose a method that applies speaker features based on speech clustering to VQ. Before codebook training, each speaker's training samples are clustered and filtered. The experiments show that the method reduces the codebook size from 64 with simple VQ to 8 with VQ based on clustered features. The results show that, on the one hand, the approach alleviates to a certain extent the distortion caused by the lack of training samples; on the other hand, it achieves better recognition results with a smaller codebook. In other words, the efficiency of speaker recognition is increased.
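The pipeline described in the abstract (cluster and filter the feature vectors, then train a VQ codebook, then identify a speaker by lowest quantization distortion) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the filtering rule, so the rule used here (keep the fraction of vectors closest to each cluster centroid) is an assumption, as are all function names, the parameters `n_clusters` and `keep`, and the use of plain k-means for both clustering and codebook training.

```python
import numpy as np


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(vectors), size=k, replace=False)
    centroids = vectors[start].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every vector to every centroid.
        d = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids.
    d = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)


def filter_by_cluster(vectors, n_clusters=4, keep=0.8, seed=0):
    """ASSUMED filtering rule (not from the paper): within each cluster,
    keep only the `keep` fraction of vectors nearest the centroid."""
    centroids, labels = kmeans(vectors, n_clusters, seed=seed)
    dist = ((vectors - centroids[labels]) ** 2).sum(axis=1)
    kept = []
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        n_keep = max(1, int(round(keep * len(idx))))
        kept.append(idx[np.argsort(dist[idx])[:n_keep]])
    return vectors[np.sort(np.concatenate(kept))]


def train_codebook(vectors, size=8, seed=0):
    """Filter the training vectors first, then train a VQ codebook."""
    filtered = filter_by_cluster(vectors, seed=seed)
    codebook, _ = kmeans(filtered, size, seed=seed)
    return codebook


def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda s: distortion(test_vectors, codebooks[s]))
```

To use the sketch, one would build one codebook per enrolled speaker with `train_codebook(features, size=8)`, where `features` is an array of per-frame feature vectors, and pass a dictionary of those codebooks to `identify` together with the feature vectors of the test utterance.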
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2007, No. 27, pp. 196-198, 208 (4 pages in total)
Computer Engineering and Applications