
Syllable lattice based Chinese speech retrieval techniques and removing redundancy method from indices (cited by: 7)
Abstract: As the amount of speech data on the Internet keeps growing, there is a pressing need for fast, accurate speech retrieval based on semantic content. In this study of syllable lattice based Chinese speech retrieval, a retrieval method built on keyword spotting is proposed to address the shortcomings of the traditional vector space model method. To address the information redundancy in lattice indices, an index redundancy removal method based on a syllable posterior probability histogram is also proposed; it uses the histogram to distinguish useful information from redundant information and removes the latter from the lattice indices. Experimental results show that the proposed retrieval method clearly outperforms the vector space model method, and that the redundancy removal method substantially reduces index size and speeds up retrieval.
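The abstract does not say how the syllable posterior probability histogram separates useful index entries from redundant ones. The sketch below is one plausible reading only, not the authors' algorithm: it bins lattice index entries by posterior probability, then keeps the highest bins that together hold a chosen share of the total posterior mass. The names `Entry`, `bin_of`, `cutoff_bin`, `prune`, and the parameters `num_bins` and `keep_mass` are all hypothetical.

```python
from collections import namedtuple

# Hypothetical index entry: one syllable arc from a lattice with its posterior.
Entry = namedtuple("Entry", "syllable start end posterior")


def bin_of(posterior, num_bins):
    """Map a posterior in [0, 1] to a histogram bin index."""
    return min(int(posterior * num_bins), num_bins - 1)


def cutoff_bin(entries, num_bins=10, keep_mass=0.9):
    """Return the lowest bin that must be kept so that the kept bins
    hold at least keep_mass of the total posterior mass (assumed criterion)."""
    mass = [0.0] * num_bins
    for e in entries:
        mass[bin_of(e.posterior, num_bins)] += e.posterior
    total = sum(mass)
    kept = 0.0
    # Walk from the most confident bin downwards until enough mass is covered.
    for b in range(num_bins - 1, -1, -1):
        kept += mass[b]
        if kept >= keep_mass * total:
            return b
    return 0


def prune(entries, num_bins=10, keep_mass=0.9):
    """Drop entries that fall below the cutoff bin."""
    cut = cutoff_bin(entries, num_bins, keep_mass)
    return [e for e in entries if bin_of(e.posterior, num_bins) >= cut]


index = [
    Entry("yu", 0, 12, 0.95),
    Entry("yin", 12, 30, 0.90),
    Entry("jian", 30, 51, 0.85),
    Entry("yv", 0, 12, 0.05),
    Entry("ying", 12, 30, 0.02),
    Entry("jiang", 30, 51, 0.01),
]
pruned = prune(index)
```

With this toy index, the three low-posterior competitors are dropped and the three confident syllables remain, which illustrates how pruning can shrink an index while keeping most of its probability mass. The actual selection rule in the paper may differ.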
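Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 6, pp. 526-533 (8 pages)
Funding: National Natural Science Foundation of China (60575030); National Basic Research Program of China (2007CB311104)
Keywords: lattice; Chinese speech; redundancy method; retrieval technique; index; syllable; vector space model; retrieval method; image retrieval; information retrieval; quality assurance; reliability; speech; vectors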

References (19)

  • 1. Abberley D, Renals S, Cook G. Retrieval of broadcast news documents with the THISL system. In: Proc. ICASSP98, Seattle, 1998: 3781-3784
  • 2. Logan B, Moreno P, Deshmukh O. Word and subword indexing approaches for reducing the effects of OOV queries on spoken audio. In: Proc. HLT2002, San Diego, 2002
  • 3. Ng K, Zue V W. Subword-based approaches for spoken document retrieval. Speech Communication, 2000; 32: 157-186
  • 4. Wechsler M, Schauble P. Speech retrieval based on automatic indexing. In: Proc. MIRO '95, Glasgow, 1995
  • 5. Foote J T et al. Unconstrained keyword spotting using phone lattices with application to spoken document retrieval. Computer Speech and Language, 1997; 2: 207-224
  • 6. Cardillo P S, Clements M, Miller M S. Phonetic searching vs. LVCSR: How to find what you really want in audio archives. International Journal of Speech Technology, 2002; 5: 9-22
  • 7. Seide F et al. Vocabulary independent search in spontaneous speech. In: Proc. ICASSP'04, Montreal, 2004: I253-I256
  • 8. Yu P, Seide F. A hybrid word/phoneme-based approach for improved vocabulary-independent search in spontaneous speech. In: Proc. INTERSPEECH-2004, Jeju Island, Korea, 2004: 293-296
  • 9. Woodland P C et al. Effects of out of vocabulary words in spoken document retrieval. In: Proc. SIGIR, Athens, Greece, 2000: 372-374
  • 10. Wang Hsin-min. Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese. Speech Communication, 2000; 32: 49-60
