
Speaker segmentation based on discriminative deep belief networks
Cited by: 10
Abstract: A training method for a discriminative deep belief network (DDBN) based on the Fisher criterion is proposed, operating on the supervector feature space of speech signals. The network yields speaker codebook vector features that are superior to those obtained from a conventional deep belief network (DBN), and these codebook features are used to cluster and segment multi-speaker audio. Experiments on a multi-speaker audio stream corpus generated from the TIMIT database show that the DDBN speaker segmentation algorithm with the Fisher criterion clearly outperforms both the conventional Bayesian information criterion (BIC) method and the DBN method.
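Source: Journal of Tsinghua University (Science and Technology), 2013, No. 6, pp. 804-807, 812 (5 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: Key Project of the Science and Technology Program of the Beijing Municipal Commission of Education (KZ201110005005); National Natural Science Foundation of China (61072089).
Keywords: speaker segmentation; discriminative deep belief network; Fisher criterion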

