
Robust Audio Retrieval Method Based on Fingerprint Factors (cited by: 3)
Abstract: To address lookup failures caused by noise in content-based audio retrieval, this paper proposes a noise-robust audio feature extraction algorithm based on audio fingerprint factors and a semi-supervised audio dictionary training algorithm, with the aim of improving retrieval precision under noise. The method extracts audio fingerprints from Mel spectra and uses non-negative matrix factorization to decompose each fingerprint into a noise-robust spectral (frequency) factor and a temporal factor, which serve as features. The audio dictionary is trained with the proposed semi-supervised algorithm: a set of sound effects is used to estimate the distribution space of basic sound effects, which serves as the initial dictionary, and the dictionary is updated dynamically while the data are quantized so that it describes the data accurately. Experimental results show that, at low signal-to-noise ratio (SNR), the average query precision of the proposed method is significantly higher than that of other algorithms.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2016, No. 5, pp. 1020-1027 (8 pages)
Funding: Supported by the National Natural Science Foundation of China (61301300)
Keywords: audio retrieval; audio fingerprint; non-negative matrix factorization; audio dictionary; inverted index
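
The factorization step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fingerprint size, the factorization rank, or which NMF variant the authors use, so everything below is an assumption: the fingerprint is stood in for by a synthetic non-negative matrix of 40 Mel bands by 32 frames, the rank is 3, and the solver is the standard multiplicative-update rule for the Frobenius objective. The function name `nmf` is hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (frequency x time) as W @ H.

    W (frequency x rank) plays the role of the spectral factors and
    H (rank x time) the temporal factors. Uses multiplicative updates
    for the Frobenius-norm objective (an assumed NMF variant).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative and does not
        # increase the reconstruction error.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a Mel-spectrum fingerprint (sizes are assumptions).
rng = np.random.default_rng(42)
fingerprint = rng.random((40, 3)) @ rng.random((3, 32))

W, H = nmf(fingerprint, rank=3)
rel_err = np.linalg.norm(fingerprint - W @ H) / np.linalg.norm(fingerprint)
print("spectral factors:", W.shape, "temporal factors:", H.shape)
print("relative reconstruction error:", rel_err)
```

In a real pipeline the synthetic matrix would be replaced by a patch of an actual Mel spectrogram, and the columns of `W` and rows of `H` would be passed on as features to the dictionary quantization stage.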

