Vector quantization technology based on clustering features in speaker recognition

Cited by: 2
Abstract: To address the distortion that arises when vector quantization (VQ) is used for speaker recognition, this paper proposes a method that combines VQ with clustering of speech features, drawing on the pronunciation characteristics of Chinese speech. Before the VQ codebook is trained, each speaker's training feature vectors are clustered and filtered. Experiments show that, for test utterances 4 s long and a recognition rate held at about 95%, ordinary VQ needs a codebook size of 64, whereas the proposed method needs a codebook size of only 8, an eightfold reduction. The results indicate that the method mitigates, to some extent, the distortion caused by insufficient training samples, and that it achieves good recognition results with fewer codewords, which improves recognition efficiency.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2007, No. 27, pp. 196-198, 208 (4 pages)
Keywords: speaker recognition; vector quantization; clustering features; Mel-frequency cepstral coefficients (MFCC)
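The pipeline outlined in the abstract (cluster and filter the training vectors, train a VQ codebook, then identify a speaker by minimum average distortion) might look roughly like the sketch below. The abstract does not say how the clustering filter works, so the rule used here (keep the fraction of vectors closest to their cluster centroid) is an assumption, not the authors' method. Plain k-means stands in for whatever codebook training the paper uses, synthetic 2-D points stand in for MFCC vectors, and every function name and parameter is hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v, codebook):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: dist2(v, codebook[i]))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means, used here as a stand-in codebook trainer."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for v in vectors:
            cells[nearest(v, codebook)].append(v)
        # An empty cell keeps its previous codeword.
        codebook = [mean(c) if c else codebook[i] for i, c in enumerate(cells)]
    return codebook


def cluster_filter(vectors, k, keep=0.8, seed=0):
    """ASSUMED filter rule: cluster, then keep the `keep` fraction of
    vectors that lie closest to their own cluster centroid."""
    centroids = kmeans(vectors, k, seed=seed)
    ranked = sorted(vectors,
                    key=lambda v: dist2(v, centroids[nearest(v, centroids)]))
    return ranked[:max(k, int(len(ranked) * keep))]


def train_codebook(vectors, size, prefilter=True):
    """Train one speaker's codebook, optionally after the cluster filter."""
    if prefilter:
        vectors = cluster_filter(vectors, size)
    return kmeans(vectors, size)


def avg_distortion(vectors, codebook):
    """Mean quantization error of the vectors against one codebook."""
    return sum(dist2(v, codebook[nearest(v, codebook)])
               for v in vectors) / len(vectors)


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks,
               key=lambda name: avg_distortion(test_vectors, codebooks[name]))


def synthetic_features(center, n, seed):
    """Toy Gaussian features standing in for MFCC vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


# Two toy "speakers" with well-separated feature distributions.
train = {"A": synthetic_features([0.0, 0.0], 200, 1),
         "B": synthetic_features([6.0, 6.0], 200, 2)}
codebooks = {name: train_codebook(v, size=8) for name, v in train.items()}
print(identify(synthetic_features([0.0, 0.0], 50, 3), codebooks))
```

The codebook size of 8 mirrors the figure quoted in the abstract. The toy data says nothing about the reported 95% recognition rate, which depends on real speech features and on the paper's actual filtering rule.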