
Term-Dependent Confidence Normalisation for Out-of-Vocabulary Spoken Term Detection (cited by: 2)

Abstract: An important component of a spoken term detection (STD) system is estimating confidence measures for hypothesised detections. A potential problem with the widely used lattice-based confidence estimation, however, is that confidence scores are treated uniformly for all search terms, regardless of how much the terms differ in phonetic or linguistic properties. This problem is particularly evident for out-of-vocabulary (OOV) terms, which tend to exhibit high intra-term diversity. To address the impact of term diversity on confidence measures, we propose a term-dependent normalisation technique that compensates for term diversity in confidence estimation. We first derive an evaluation-metric-oriented normalisation that optimises the evaluation metric by compensating for the diverse occurrence rates among terms. We then propose a linear bias compensation and a discriminative compensation to deal with the bias problem that is inherent in lattice-based confidence measurement and from which the Term Specific Threshold (TST) approach suffers. We tested the proposed technique on speech data from the multi-party meeting domain with two state-of-the-art STD systems, based on phonemes and words respectively. The experimental results demonstrate that the confidence normalisation approach leads to a significant performance improvement in STD, particularly for OOV terms with phoneme-based systems.
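The abstract names the Term Specific Threshold (TST) approach without defining it. As an illustration only, the following is a minimal sketch of a term-specific threshold in its commonly cited form, assuming the NIST STD 2006 ATWV metric with weight β = 999.9 and estimating each term's true occurrence count as the sum of its detection confidences. None of these specifics is stated in the abstract; the function names and input layout are hypothetical, and this is not the normalisation the paper proposes.

```python
from collections import defaultdict

BETA = 999.9  # assumed ATWV weight (NIST STD 2006 setting), not taken from the abstract


def term_specific_thresholds(detections, speech_seconds, beta=BETA):
    """Return a per-term decision threshold.

    detections: iterable of (term, confidence) pairs, confidence in [0, 1].
    speech_seconds: total duration of the searched audio, must be positive.
    """
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")

    # Estimate the number of true occurrences of each term as the sum of
    # the confidences of its hypothesised detections.
    expected_true = defaultdict(float)
    for term, conf in detections:
        expected_true[term] += conf

    # Threshold at which accepting a detection stops increasing the
    # expected term-weighted value; it grows with the expected count.
    return {
        term: n_true / (speech_seconds / beta + (beta - 1.0) / beta * n_true)
        for term, n_true in expected_true.items()
    }


def decide(detections, speech_seconds, beta=BETA):
    """Attach a YES/NO decision to each detection using its term's threshold."""
    detections = list(detections)
    thresholds = term_specific_thresholds(detections, speech_seconds, beta)
    return [(term, conf, conf > thresholds[term]) for term, conf in detections]
```

Under these assumptions, a term with a small expected count receives a lower threshold than a frequent one, which is the occurrence-rate effect the abstract alludes to. Because the expected count is computed from the confidences themselves, any bias in those confidences carries over into the threshold.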
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI, and CSCD; 2012, Issue 2, pp. 358-375 (18 pages)
Keywords: confidence estimation, discriminative model, spoken term detection, speech recognition