Subword-Based Position Specific Posterior Lattices for Chinese Spoken Document Indexing (基于子词PSPL的汉语语音文档索引)


Abstract: To address the mismatch between the optimal recognition unit and the optimal retrieval unit in Chinese spoken document retrieval, this paper proposes a spoken document indexing method based on subword-based position specific posterior lattices (S-PSPL). The method first decodes the spoken documents with a word-based speech recognizer to obtain a word-level PSPL. Each word in the PSPL is then split into its constituent subword units, and the PSPL is converted to the corresponding S-PSPL according to the posterior probability relationship between each subword arc and its original word arc. The S-PSPL serves as the index for query term retrieval. Experimental results show that the method exploits the rich linguistic information of a well-trained language model while avoiding the incorrect boundary segmentation of the word-based decoder, and that its retrieval performance is clearly better than that of the widely used PSPL indexing method in which both the recognition unit and the retrieval unit are words.
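The word-to-subword conversion described in the abstract can be illustrated with a small Python sketch. The abstract does not give the exact posterior mapping, so this is a simplification and not the authors' algorithm. It assumes that characters are the subword units, that each word entry already carries the character offset of its first character, and that every character inherits the full posterior of its word, with posteriors summed when several words place the same character at the same position. The function names `word_pspl_to_subword_pspl` and `find_query_starts`, and the toy posteriors, are hypothetical.

```python
from collections import defaultdict


def word_pspl_to_subword_pspl(word_entries):
    """Convert toy word-level entries into character-level position bins.

    word_entries: iterable of (word, start, posterior), where `start` is the
    ASSUMED character offset of the word's first character.
    Returns {position: {character: accumulated posterior}}.
    """
    bins = defaultdict(lambda: defaultdict(float))
    for word, start, posterior in word_entries:
        # Assumption: each character inherits its word's posterior, and
        # identical characters at the same position accumulate.
        for offset, char in enumerate(word):
            bins[start + offset][char] += posterior
    return {pos: dict(chars) for pos, chars in bins.items()}


def find_query_starts(subword_pspl, query):
    """Return start positions where every query character appears in
    consecutive position bins (illustrative matching only, no ranking)."""
    return [
        start
        for start in sorted(subword_pspl)
        if all(query[k] in subword_pspl.get(start + k, {})
               for k in range(len(query)))
    ]


# Toy data: the recognizer hypothesizes either one word "北京大学"
# or the two-word segmentation "北京" + "大学" for the same audio.
toy_entries = [("北京大学", 0, 0.6), ("北京", 0, 0.4), ("大学", 2, 0.4)]
toy_pspl = word_pspl_to_subword_pspl(toy_entries)
print(find_query_starts(toy_pspl, "大学"))
```

In this toy case a word-level index would match the query "大学" only on the two-word path, whereas the character-level bins pool evidence from both segmentations, which is the boundary problem the abstract refers to.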

Authors: LU Mingming (陆明明), ZHANG Lianhai (张连海), QU Dan (屈丹)

Affiliation: School of Information Engineering, PLA Information Engineering University (解放军信息工程大学信息工程学院)

Source: Journal of Applied Sciences (《应用科学学报》), 2013, No. 3, pp. 259-265 (7 pages). Indexed in CAS, CSCD, and the Peking University core journal list.

Funding: Supported by the National Natural Science Foundation of China (No. 61175017)

Keywords: spoken document retrieval, spoken document indexing, subword-based position specific posterior lattices, lattice, subword posterior probability

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]



