An Inverted Index Based Audio Retrieval Method (一种基于倒排索引的音频检索方法)
Cited by: 8
Abstract: Traditional example-based audio retrieval algorithms use a sequential index, so retrieval must traverse the entire database, resulting in intolerable waiting times. To address this, the paper proposes an audio retrieval algorithm based on an inverted index. The method first builds a super-vector from several audio features and applies multi-level audio segmentation to split the continuous audio stream into short segments with small feature fluctuation. Using a pre-trained audio dictionary, the sequence of short segments is then converted into a sequence of audio words that represent the audio content, and an inverted index is built from that sequence. At retrieval time, the user's query is converted into audio words, and the inverted index locates candidate passages directly without traversing the database; the candidates are ranked by their content similarity to the query, and the ranked list is returned as the retrieval result. Simulation experiments use four evaluation criteria: rank of the matching item, proportion of same-class results, localization accuracy (overlap ratio), and retrieval time. The results show that the algorithm achieves 92.58% retrieval accuracy with an average retrieval time of 1.101 s.
Source: Journal of Electronics & Information Technology (《电子与信息学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 11, pp. 2561-2567 (7 pages).
Funding: Supported by the National Natural Science Foundation of China (60972132, 61101160).
Keywords: audio signal processing; audio retrieval; content similarity; inverted index
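
The pipeline summarized in the abstract (quantize short segments into audio words, build an inverted index, look up candidates, rank by similarity) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional feature vectors, the nearest-entry quantization in `quantize`, the cosine similarity over audio-word counts in `cosine`, and all function names. The paper's actual features, dictionary training, and content-similarity measure are not given in the abstract.

```python
import math
from collections import Counter, defaultdict


def quantize(segment, dictionary):
    """Return the index of the dictionary entry nearest to a segment's
    feature vector. That index plays the role of an 'audio word'.
    (Nearest-neighbour quantization is an assumption of this sketch.)"""
    return min(range(len(dictionary)),
               key=lambda i: math.dist(segment, dictionary[i]))


def build_index(database, dictionary):
    """database maps doc_id -> list of segment feature vectors.
    Returns (inverted index, per-document audio-word sequences)."""
    index = defaultdict(list)  # audio word -> [(doc_id, position), ...]
    words = {}
    for doc_id, segments in database.items():
        words[doc_id] = [quantize(s, dictionary) for s in segments]
        for pos, word in enumerate(words[doc_id]):
            index[word].append((doc_id, pos))
    return index, words


def cosine(a, b):
    """Cosine similarity between two audio-word sequences, treated as
    bags of words (a stand-in for the paper's similarity measure)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def search(query_segments, index, words, dictionary):
    """Locate candidate passages through the index, with no scan over
    the whole database, then rank them by similarity to the query."""
    q = [quantize(s, dictionary) for s in query_segments]
    candidates = set()
    for qpos, word in enumerate(q):
        for doc_id, pos in index.get(word, ()):
            start = pos - qpos  # align the hit with its query offset
            if start >= 0:
                candidates.add((doc_id, start))
    scored = [(cosine(q, words[d][s:s + len(q)]), d, s)
              for d, s in candidates]
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


# Toy data: three dictionary entries and two short "recordings".
dictionary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
database = {
    "a": [(0.1, 0.0), (0.9, 1.1), (5.2, 4.9), (0.0, 0.1)],
    "b": [(5.0, 5.1), (5.1, 5.0), (0.1, 0.1)],
}
index, words = build_index(database, dictionary)
results = search([(1.0, 0.9), (4.8, 5.0)], index, words, dictionary)
for score, doc_id, start in results:
    print(f"{doc_id} @ segment {start}: similarity {score:.3f}")
```

Only the postings for the audio words that occur in the query are visited, so lookup cost depends on posting-list lengths rather than on database size, which is the advantage over a sequential scan that the abstract claims.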