
High-accuracy large-scale audio sample retrieval with a double hash index (cited by: 1)

Retrieval method for large-scale audio samples based on a double hashing index
Abstract When large-scale audio samples are retrieved from a real-time audio stream, retrieval speed directly determines the real-time processing capacity of the stream once accuracy is guaranteed. This paper proposes a retrieval method for large-scale audio samples based on a double hash index. The method first applies self-similarity quantization to the audio features of the samples, then builds a linear double hash index from the mean and the modulus of the segment vectors of each self-similarity sequence, then searches the audio stream, and finally verifies the search hits using the temporal and spatial information of the audio to obtain the retrieval result. Experimental results show that the method achieves one-pass retrieval of large-scale audio samples. With 12-dimensional MFCC features, an audio sample duration of 16 s, and fewer than 3100 audio samples, retrieval accuracy is above 90% and retrieval speed exceeds 12000 times real time (xRT), peaking at 16000 xRT. The method improves retrieval accuracy while maintaining a high retrieval speed.
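The indexing and verification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the self-similarity quantization step is omitted (segment vectors are passed in directly), the bucket widths `mean_step` and `norm_step` are arbitrary assumptions, and the offset-voting check in `search` is only a simple stand-in for the temporal verification the paper mentions. All names (`segment_keys`, `DoubleHashIndex`, `search`) are hypothetical.

```python
import math
from collections import defaultdict


def segment_keys(segment, mean_step=0.5, norm_step=0.5):
    """Quantize one segment vector into (mean bucket, modulus bucket)."""
    mean = sum(segment) / len(segment)
    norm = math.sqrt(sum(x * x for x in segment))
    return math.floor(mean / mean_step), math.floor(norm / norm_step)


class DoubleHashIndex:
    """Two hash tables over the same segments: one keyed by the
    quantized mean, one keyed by the quantized modulus (assumed design)."""

    def __init__(self, mean_step=0.5, norm_step=0.5):
        self.mean_step = mean_step
        self.norm_step = norm_step
        self.by_mean = defaultdict(set)
        self.by_norm = defaultdict(set)

    def add(self, sample_id, segments):
        # Register every segment of a sample under both keys.
        for pos, seg in enumerate(segments):
            m, n = segment_keys(seg, self.mean_step, self.norm_step)
            self.by_mean[m].add((sample_id, pos))
            self.by_norm[n].add((sample_id, pos))

    def candidates(self, segment):
        # A segment is a candidate only if it matches in both tables.
        m, n = segment_keys(segment, self.mean_step, self.norm_step)
        return self.by_mean.get(m, set()) & self.by_norm.get(n, set())


def search(index, stream_segments, sample_lengths):
    """Single pass over the stream. A sample is reported at a start
    position only if all of its segments hit at a consistent offset."""
    votes = defaultdict(int)
    for t, seg in enumerate(stream_segments):
        for sample_id, pos in index.candidates(seg):
            votes[(sample_id, t - pos)] += 1
    return sorted(
        (sample_id, start)
        for (sample_id, start), count in votes.items()
        if count == sample_lengths[sample_id]
    )


index = DoubleHashIndex()
index.add("a", [[1.0, 2.0], [3.0, 4.0]])
index.add("b", [[5.0, 5.0], [0.0, 1.0]])
stream = [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0], [7.0, 7.0]]
hits = search(index, stream, {"a": 2, "b": 2})
print(hits)
```

In this sketch every sample is indexed once and the stream is scanned once, so lookup cost per stream segment does not grow with a per-sample loop. Hard bucket boundaries mean that slightly perturbed features can fall into a neighbouring bucket; a practical system would need some tolerance for that, which this sketch does not attempt.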
Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 6, pp. 886-893 (8 pages)