VP-tree based multi-stage matching algorithm for query-by-humming systems

Cited by: 4


Abstract: Query by humming (QBH) is an important application of music information retrieval. Its main difficulties are the unstructured nature of audio song data and the trade-off between search speed and accuracy. This paper proposes a new database construction method: the main melody of each song in the collection is extracted by manual labeling and divided into segments at natural singing pauses, and these segments, rather than whole songs, serve as the index. It also proposes a VP-tree based search structure with a corresponding multi-level search algorithm, which uses a coarse search in the fast-match layer and a dynamic time warping (DTW) based algorithm in the precise-match layer. Evaluations with 2213 melody segments show that search time is reduced by over 40% without greatly reducing recognition accuracy.
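The two-layer idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which coarse distance the fast-match layer uses, so the sketch assumes Euclidean distance between pitch contours resampled to a fixed length (a true metric, which VP-tree pruning needs), and it assumes pitch is given as MIDI note numbers. All names (`resample`, `VPNode`, `knn`, `search`) and the toy segments are hypothetical.

```python
import heapq
import random

def resample(seq, n=16):
    """Linearly resample a pitch contour to n points (assumed coarse feature)."""
    if len(seq) == 1:
        return [float(seq[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        out.append(seq[lo] * (1 - (pos - lo)) + seq[hi] * (pos - lo))
    return out

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def dtw(a, b):
    """Classic dynamic time warping with absolute-difference local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            cur[j] = abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]

class VPNode:
    """Vantage-point tree node over a non-empty list of (feature, payload)."""
    def __init__(self, items, rng):
        idx = rng.randrange(len(items))
        self.vp = items[idx]
        rest = items[:idx] + items[idx + 1:]
        self.radius, self.inner, self.outer = 0.0, None, None
        if rest:
            dists = [euclid(self.vp[0], f) for f, _ in rest]
            self.radius = sorted(dists)[len(dists) // 2]  # median split
            inner = [it for it, d in zip(rest, dists) if d < self.radius]
            outer = [it for it, d in zip(rest, dists) if d >= self.radius]
            self.inner = VPNode(inner, rng) if inner else None
            self.outer = VPNode(outer, rng) if outer else None

def knn(node, q, k, heap):
    """Exact k-nearest search; heap holds (-distance, tiebreak, payload)."""
    if node is None:
        return
    d = euclid(node.vp[0], q)
    if len(heap) < k:
        heapq.heappush(heap, (-d, id(node), node.vp[1]))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, id(node), node.vp[1]))
    def tau():  # current k-th best distance, or infinity while heap not full
        return -heap[0][0] if len(heap) == k else float("inf")
    if d < node.radius:
        knn(node.inner, q, k, heap)
        if d + tau() >= node.radius:   # outer ball may still hold a closer item
            knn(node.outer, q, k, heap)
    else:
        knn(node.outer, q, k, heap)
        if d - tau() <= node.radius:   # inner ball may still hold a closer item
            knn(node.inner, q, k, heap)

def search(tree, segments, query, k=2):
    """Fast match: k candidates from the VP-tree. Fine match: rerank by DTW."""
    heap = []
    knn(tree, resample(query), k, heap)
    candidates = [i for _, _, i in heap]
    return min(candidates, key=lambda i: dtw(query, segments[i][1]))

# Toy database of melody segments (hypothetical data).
segments = [
    ("a1", [60, 62, 64, 65, 67]),
    ("a2", [67, 65, 64, 62, 60]),
    ("b1", [60, 60, 67, 67, 69, 69, 67]),
    ("b2", [72, 71, 69, 67, 65, 64]),
    ("c1", [55, 55, 55, 57, 59, 59]),
]
tree = VPNode([(resample(c), i) for i, (_, c) in enumerate(segments)],
              random.Random(0))
query = [60, 60, 62, 62, 64, 64, 65, 65, 67, 67]  # "a1" hummed at half tempo
best = search(tree, segments, query)
print(segments[best][0])
```

The design point is that the expensive DTW comparison runs only on the few candidates that survive the tree search, which is where a speedup of the kind reported in the abstract would come from. Note that DTW itself does not satisfy the triangle inequality, which is why this sketch keeps it out of the tree and uses it only for reranking.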

Authors: 侯珏, 刘轶, 郑方, 蒋丹宁, 秦勇, 黄石磊, 刘勇

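Affiliations: Center for Speech and Language Technologies, Research Institute of Information Technology, Tsinghua University; Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology; IBM China Research Laboratory; Industry Development Center, Shenzhen-Hong Kong Industry-University-Research Base (深港产学研基地)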

Source: Journal of Tsinghua University (Science and Technology) (《清华大学学报（自然科学版）》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2009, Issue S1, pp. 1419-1424 (6 pages)

Funding: IBM and Tsinghua University joint project (2007-2008)

Keywords: musical information retrieval; query by humming; VP-tree; dynamic time warping

Classification: TP391.3 [Automation and Computer Technology: Computer Application Technology]
