An unsupervised speech segmentation algorithm based on the speaker (cited by: 3)

Abstract: In mobile-phone conversations, the two speakers differ in both channel and acoustic characteristics, and these differences can be used to separate out the speech belonging to each speaker. This paper focuses on an unsupervised, metric-based (distance-based) speech segmentation algorithm and compares two distance measures: the Euclidean distance, and a measure that combines the generalized likelihood ratio (GLR) with duration. The latter uses the likelihood ratio of a hypothesis test to describe the similarity between two speech segments. Experiments with a text-independent speaker verification system on mobile-phone conversational speech show that the GLR-and-duration measure outperforms the Euclidean distance: it detects the great majority of speaker change points at a low computational cost.
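
The two kinds of distance measure compared in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: each segment is a list of frame-level feature vectors, each segment is modeled by a single Gaussian with diagonal covariance, and the Euclidean distance is taken between segment mean vectors. The function names are hypothetical, and the duration term of the combined measure is omitted because the abstract does not specify it.

```python
import math


def mean_and_var(frames, floor=1e-6):
    """Per-dimension maximum-likelihood mean and variance of a segment.

    `frames` is a list of equal-length feature vectors. Variances are
    floored (an assumption of this sketch) so that log() stays finite.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, floor)
        for d in range(dim)
    ]
    return mean, var


def log_det(var):
    """Log-determinant of a diagonal covariance matrix."""
    return sum(math.log(v) for v in var)


def euclidean_distance(seg_x, seg_y):
    """Euclidean distance between the mean vectors of two segments."""
    mean_x, _ = mean_and_var(seg_x)
    mean_y, _ = mean_and_var(seg_y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_x, mean_y)))


def glr_distance(seg_x, seg_y):
    """GLR distance under single diagonal-covariance Gaussian models.

    H0: both segments are generated by one Gaussian fitted to the pooled
        frames (same speaker).
    H1: each segment has its own Gaussian (different speakers).
    Returns -log(L(H0) / L(H1)); larger values suggest a speaker change.
    """
    _, var_x = mean_and_var(seg_x)
    _, var_y = mean_and_var(seg_y)
    _, var_z = mean_and_var(seg_x + seg_y)  # pooled segment
    n_x, n_y = len(seg_x), len(seg_y)
    return 0.5 * (
        (n_x + n_y) * log_det(var_z)
        - n_x * log_det(var_x)
        - n_y * log_det(var_y)
    )


# Toy two-dimensional segments: seg_b is seg_a shifted by 5 in each dimension.
seg_a = [[0.0, 1.0], [1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]]
seg_b = [[5.0, 6.0], [6.0, 5.0], [5.0, 4.0], [4.0, 5.0]]

same = glr_distance(seg_a, seg_a)       # identical segments
different = glr_distance(seg_a, seg_b)  # shifted segments
```

In a metric-based segmenter, a distance of this kind is typically evaluated between adjacent analysis windows slid along the recording, and local maxima of the resulting curve are treated as candidate speaker change points. Unlike the mean-based Euclidean distance, the GLR distance is also sensitive to differences in the spread of the two segments.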
Source: Journal of Hefei University of Technology: Natural Science (indexed in CAS, CSCD, PKU Core), 2010, No. 5, pp. 683-686, 708 (5 pages)
Funding: Zhejiang Province security system testing funded project (DB33/T334)
Keywords: cellular conversation; generalized likelihood ratio (GLR) distance measure; unsupervised speech segmentation