Term-Dependent Confidence Normalisation for Out-of-Vocabulary Spoken Term Detection (cited by: 2)

Abstract: An important component of a spoken term detection (STD) system is the estimation of confidence measures for hypothesised detections. A potential problem of the widely used lattice-based confidence estimation, however, is that the confidence scores are treated uniformly for all search terms, regardless of how much they may differ in phonetic or linguistic properties. This problem is particularly evident for out-of-vocabulary (OOV) terms, which tend to exhibit high intra-term diversity. To address the impact of term diversity on confidence measures, we propose in this work a term-dependent normalisation technique which compensates for term diversity in confidence estimation. We first derive an evaluation-metric-oriented normalisation that optimises the evaluation metric by compensating for the diverse occurrence rates among terms, and then propose a linear bias compensation and a discriminative compensation to deal with the bias problem that is inherent in lattice-based confidence measurement and from which the Term Specific Threshold (TST) approach suffers. We tested the proposed technique on speech data from the multi-party meeting domain with two state-of-the-art STD systems based on phonemes and words respectively. The experimental results demonstrate that the confidence normalisation approach leads to a significant performance improvement in STD, particularly for OOV terms with phoneme-based systems.
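The abstract gives no formulas, so the following is only a minimal sketch of the general idea behind a term-specific threshold, not the authors' own normalisation or bias compensation. It assumes the evaluation metric is NIST's term-weighted value with false-alarm weight beta = 999.9, that each confidence score is treated as a posterior probability, and that a term's true occurrence count is estimated by summing its confidences. The function names and the sample data are hypothetical.

```python
from collections import defaultdict

BETA = 999.9  # false-alarm weight assumed from the NIST STD 2006 metric


def term_specific_threshold(expected_count, speech_seconds, beta=BETA):
    """Confidence above which accepting a detection raises expected TWV.

    Accepting a detection with posterior p gains p / N_true and costs
    (1 - p) * beta / (T - N_true); the break-even point is returned here.
    """
    return beta * expected_count / (speech_seconds + (beta - 1.0) * expected_count)


def decide(detections, speech_seconds, beta=BETA):
    """Label each (term, confidence) pair as accepted or rejected.

    Returns a list of (term, confidence, accepted) triples.
    """
    detections = list(detections)
    # Assumption: estimate each term's occurrence count as the sum of
    # the confidences of all its hypothesised detections.
    expected = defaultdict(float)
    for term, conf in detections:
        expected[term] += conf
    results = []
    for term, conf in detections:
        thr = term_specific_threshold(expected[term], speech_seconds, beta)
        results.append((term, conf, conf > thr))
    return results


if __name__ == "__main__":
    # Hypothetical data: one rare term and one frequent term in 10 hours of speech.
    hyps = [("rare", 0.3)] + [("common", 0.9)] * 40 + [("common", 0.3)]
    for term, conf, accepted in decide(hyps, speech_seconds=36000.0)[:3]:
        print(term, conf, accepted)
```

Under these assumptions the same confidence of 0.3 is accepted for the rare term and rejected for the frequent one, which is the sense in which a single global threshold treats terms with different occurrence rates unfairly.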

Authors: Javier Tejedor, Simon King, Joe Frankel

Affiliations: Centre for Speech Technology Research; Human Computer Technology Laboratory (HCTLab)

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI and CSCD; 2012, Issue 2, pp. 358-375 (18 pages)

Keywords: confidence estimation, discriminative model, spoken term detection, speech recognition

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]


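Citing documents (2)

1. Xiong Shifu, Guo Wu. Out-of-vocabulary term retrieval with multi-stream information fusion [J]. Journal of Data Acquisition and Processing, 2014, 29(2): 274-279.
2. Wang Peng, Qu Dan, Zhang Wenlin. Term-dependent confidence normalisation based on ATWV optimisation and bias compensation [J]. Journal of Information Engineering University, 2015, 16(6): 711-717. (cited by: 1)

Second-level citing documents (1)

1. Zhang Weitao, Mijit Ablimit, Zheng Fang, Askar Hamdulla. Spoken keyword search for low-resource languages based on deep neural networks [J]. Modern Electronics Technique, 2022, 45(11): 68-72. (cited by: 5)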
