Robust Audio Retrieval Method Based on Fingerprint Factors (cited 3 times)

Abstract: To address lookup failures caused by noise in content-based audio retrieval, this paper proposes a noise-robust audio feature extraction algorithm based on audio fingerprint factors and a semi-supervised audio dictionary training algorithm, with the aim of improving retrieval precision under noise. The method extracts audio fingerprints from Mel spectra and uses non-negative matrix factorization (NMF) to decompose each fingerprint into a noise-robust spectral (frequency) factor and a temporal factor, which serve as features. For dictionary training, the method uses a sound-effect set to compute the distribution space of basic sound effects as the initial dictionary, then updates the dictionary dynamically while quantizing the data so that the dictionary describes the data accurately. Experimental results show that, under low signal-to-noise ratio (SNR) conditions, the average query precision of the proposed algorithm is clearly higher than that of other algorithms.
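The factorization step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the fingerprint construction, the factorization rank, or the update rule, so everything below is an assumption: the function name `nmf` is hypothetical, the input is a random non-negative matrix standing in for a Mel-spectrum fingerprint (frequency bins by time frames), and the solver uses the standard Lee-Seung multiplicative updates for the Frobenius-norm objective rather than whatever variant the paper uses.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    W (freq x rank) plays the role of the spectral factor and
    H (rank x time) the temporal factor. Uses Lee-Seung multiplicative
    updates, an assumed choice; the paper's exact solver is not stated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy stand-in for a fingerprint: an exactly rank-2 non-negative matrix
# with 20 "frequency bins" and 30 "time frames" (hypothetical sizes).
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("spectral factor:", W.shape, "temporal factor:", H.shape)
print("relative reconstruction error:", rel_err)
```

In a retrieval setting, the columns of `W` and the rows of `H` would then be the candidate features to quantize against the audio dictionary; how the paper combines them is not described in the abstract.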

Authors: Lin Jing (林静), Yang Jichen (杨继臣), Zhang Xueyuan (张雪源), Li Xinchao (李新超)

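Affiliations: Department of Mechanical, Electrical and Information Engineering, Maoming Vocational and Technical College (茂名职业技术学院); School of Electronic and Information Engineering, South China University of Technology (华南理工大学)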

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2016, Issue 5, pp. 1020-1027 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (61301300)

Keywords: audio retrieval; audio fingerprint; non-negative matrix factorization; audio dictionary; inverted index

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
