
VP-tree based multi-stage matching algorithm for query-by-humming systems

Cited by: 4
Abstract: Query by humming (QBH) is an important application of audio retrieval. Its main difficulties are the unstructured nature of audio song data and the trade-off between search speed and accuracy. This paper proposes a new way of building the song database: the main melody of each song is extracted by manual labeling and divided into segments at the natural pauses of the singing, and these segments, rather than whole songs, serve as the search index. The paper also proposes a VP-tree (vantage-point tree) search structure with a corresponding multi-level search algorithm, which uses a coarse search in the fast-match layer and a dynamic time warping (DTW) based algorithm in the fine-match layer. Evaluations with 2,213 melody segments show that the search time is reduced by over 40% without greatly reducing recognition accuracy.
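The two-layer search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which distance the coarse layer uses, so this sketch assumes Euclidean distance on mean-centered contours resampled to a fixed length (a true metric, which VP-tree pruning requires), takes the first item as the vantage point, and uses invented segment names and pitch values (MIDI note numbers) purely for illustration.

```python
import heapq
import math

def center(seq):
    """Subtract the mean pitch so that matching ignores the singer's key."""
    mean = sum(seq) / len(seq)
    return [v - mean for v in seq]

def resample(seq, n=16):
    """Linearly resample a contour to n points, then center it."""
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        out.append(seq[lo] + (seq[hi] - seq[lo]) * (pos - lo))
    return center(out)

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Classic dynamic time warping with absolute-difference local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(b)]

def build_vp(items, dist):
    """Build a VP-tree from (key, vector) pairs; a node is (item, radius, inside, outside)."""
    if not items:
        return None
    vp, rest = items[0], items[1:]  # assumption: first item is the vantage point
    if not rest:
        return (vp, 0.0, None, None)
    dists = [dist(vp[1], it[1]) for it in rest]
    mu = sorted(dists)[len(dists) // 2]  # split radius near the median distance
    inside = [it for it, d in zip(rest, dists) if d <= mu]
    outside = [it for it, d in zip(rest, dists) if d > mu]
    return (vp, mu, build_vp(inside, dist), build_vp(outside, dist))

def knn(node, q, k, dist, heap=None):
    """Collect the k nearest items as (-distance, key), pruning by the triangle inequality."""
    if heap is None:
        heap = []  # max-heap on distance via negated values
    if node is None:
        return heap
    (key, vec), mu, inside, outside = node
    d = dist(q, vec)
    if len(heap) < k:
        heapq.heappush(heap, (-d, key))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, key))
    near, far = (inside, outside) if d <= mu else (outside, inside)
    knn(near, q, k, dist, heap)
    tau = -heap[0][0] if len(heap) == k else float("inf")
    if abs(d - mu) <= tau:  # the far side can still hold a closer item
        knn(far, q, k, dist, heap)
    return heap

def search(query, segments, tree, k=3):
    """Fast-match layer: VP-tree k-NN. Fine-match layer: DTW re-ranking of the candidates."""
    candidates = [key for _, key in knn(tree, resample(query), k, euclid)]
    qc = center(query)
    return sorted(candidates, key=lambda key: dtw(qc, center(segments[key])))

# Hypothetical melody segments and a hummed query (seg_a transposed and stretched).
segments = {
    "seg_a": [60, 62, 64, 65, 67],
    "seg_b": [67, 65, 64, 62, 60],
    "seg_c": [60, 60, 67, 67, 69, 69, 67],
    "seg_d": [64, 64, 64, 62, 60, 62],
}
tree = build_vp([(key, resample(seq)) for key, seq in segments.items()], euclid)
query = [62, 62, 64, 64, 66, 67, 67, 69, 69]
print(search(query, segments, tree))
```

DTW is used only for re-ranking here because it does not satisfy the triangle inequality, so it cannot drive the VP-tree pruning directly. The candidate count `k` controls the speed/accuracy trade-off that the abstract refers to: a larger `k` sends more segments to the costlier DTW layer.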
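Source: Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》), 2009, Issue S1, pp. 1419-1424 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: IBM and Tsinghua University joint project (2007-2008)
Keywords: retrieval engine; query by humming; VP-tree; dynamic time warping; musical information retrieval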

References (8)

  • 1 Ada Wai-chee Fu, Polly Mei-shuen Chan, Yin-Ling Cheung, Yiu Sang Moon. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances [J]. The VLDB Journal, 2000 (2).
  • 2 Ghias A, Logan J, Chamberlin D, et al. Query by humming: musical information retrieval in an audio database. ACM Multimedia 95 - Electronic Proceedings, 1995.
  • 3 Wu X, Li M, Liu J, et al. A top-down approach to melody match in pitch contour for query by humming. Proc of International Symposium of Chinese Spoken Language Processing, 2006.
  • 4 Kruskal J B. An overview of sequence comparison: Time warps, string edits, and macromolecules. 1983.
  • 5 Dannenberg R B, Hu N. Understanding search performance in query-by-humming systems. Proc of International Symposium of Music Information Retrieval, 2004.
  • 6 Jyh-Shing Roger Jang. DTW for Melody Recognition. http://neural.cs.nthu.edu.tw/jang/
  • 7 Wikipedia. Octave. http://en.wikipedia.org/wiki/Octave
  • 8 Fu, Ada Wai-Chee, Chan, Polly Mei Shuen, Cheung, Yin-Ling, Moon, Yiu Sang. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. The VLDB Journal, 2000.

