Abstract
Query by humming (QBH) is an important application of music information retrieval. Its main difficulties are the unstructured nature of audio song data and the trade-off between search speed and accuracy. This paper proposes a new database construction method: the main melody of each song in the collection is extracted by manual labeling and segmented at natural singing pauses, and these segments, rather than entire songs, serve as the search index. The paper also proposes a VP-tree-based search structure with a corresponding multi-level search algorithm, which uses a coarse search in the fast-matching layer and dynamic time warping (DTW) in the precise-matching layer. In experiments on 2,213 melody segments, search time fell by more than 40% with little effect on recognition accuracy.
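The two-layer idea in the abstract (a fast coarse match through a VP-tree, then a DTW fine match on the surviving candidates) can be illustrated with a minimal sketch. The abstract does not specify the paper's coarse features, its VP-tree distance, or its DTW variant, so everything here is an assumption: pitch contours are lists of MIDI note numbers, the coarse feature is a contour resampled to 8 points and compared by Euclidean distance, and the names `resample`, `build`, `knn`, and `search` are hypothetical. Transposition invariance is not handled.

```python
import heapq
import math
import random

def resample(seq, n=8):
    """Linearly resample a pitch contour to n points (assumed coarse feature)."""
    if len(seq) == 1:
        return [float(seq[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Classic dynamic time warping with absolute-difference local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(b)]

class VPNode:
    def __init__(self, vp, radius, inner, outer):
        self.vp, self.radius, self.inner, self.outer = vp, radius, inner, outer

def build(items, rng):
    """Build a VP-tree over (feature, segment_index) pairs."""
    if not items:
        return None
    items = list(items)
    vp = items.pop(rng.randrange(len(items)))  # random vantage point
    if not items:
        return VPNode(vp, 0.0, None, None)
    ranked = sorted(items, key=lambda it: euclid(vp[0], it[0]))
    mid = len(ranked) // 2
    radius = euclid(vp[0], ranked[mid][0])  # median distance splits the set
    return VPNode(vp, radius, build(ranked[:mid], rng), build(ranked[mid:], rng))

def knn(node, q, k, heap):
    """Collect the k nearest features to q in a max-heap of (-dist, index)."""
    if node is None:
        return
    d = euclid(q, node.vp[0])
    if len(heap) < k:
        heapq.heappush(heap, (-d, node.vp[1]))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, node.vp[1]))
    near, far = (node.inner, node.outer) if d < node.radius else (node.outer, node.inner)
    knn(near, q, k, heap)
    tau = -heap[0][0] if len(heap) == k else float("inf")
    if abs(d - node.radius) <= tau:  # triangle inequality: far side may still hold a hit
        knn(far, q, k, heap)

def search(query, segments, tree, k_coarse=3):
    """Fast-match layer: VP-tree k-NN on coarse features; fine-match layer: DTW."""
    heap = []
    knn(tree, resample(query), k_coarse, heap)
    candidates = [idx for _, idx in heap]
    return min(candidates, key=lambda i: dtw(query, segments[i]))

# Toy database of melody segments (illustrative data, not from the paper).
segments = [
    [60, 62, 64, 65, 67],
    [67, 65, 64, 62, 60],
    [60, 60, 67, 67, 69, 69, 67],
    [64, 64, 65, 67, 67, 65, 64, 62],
]
tree = build([(resample(s), i) for i, s in enumerate(segments)], random.Random(0))
query = [60, 60, 60, 67, 67, 67, 69, 69, 67, 67]  # segment 2 hummed more slowly
best = search(query, segments, tree)
print(best)
```

In this toy setup the query is a time-stretched copy of segment 2, so DTW aligns the two with zero cost and `search` returns index 2. The speed gain in such a design comes from running DTW only on the `k_coarse` candidates instead of on every segment; how the paper itself chooses the candidate count is not stated in the abstract.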
Source
Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2009, Issue S1, pp. 1419-1424 (6 pages)
Funding
IBM and Tsinghua University cooperation project (2007-2008)
Keywords
retrieval engine
music information retrieval
query by humming
VP-tree
dynamic time warping