Abstract
To address retrieval failures caused by noise in content-based audio retrieval, this paper proposes a noise-robust audio feature extraction algorithm based on audio fingerprint factors and a semi-supervised audio dictionary training algorithm, with the aim of improving retrieval precision under noise. The method extracts audio fingerprints from Mel spectra and uses non-negative matrix factorization (NMF) to decompose each fingerprint into a noise-robust spectral (frequency) factor and a temporal factor, which serve as features. The audio dictionary is trained with the proposed semi-supervised algorithm: a sound-effect set is used to compute the distribution space of basic sound effects as the initial dictionary, and the dictionary is then updated dynamically while the data are quantized so that it describes the data accurately. Experimental results show that, at low signal-to-noise ratio (SNR), the average retrieval precision of the proposed method is significantly higher than that of the other algorithms.
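The factorization step named in the abstract can be illustrated with a minimal sketch. The abstract does not state which NMF variant, rank, or cost function the authors use, so the code below assumes the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. The function names (`nmf`, `matmul`, `transpose`, `frobenius_error`) and all parameters are hypothetical and are not taken from the paper. The input `v` stands in for a non-negative fingerprint matrix (frequency bins by time frames); `w` plays the role of the spectral factor and `h` the temporal factor.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iters=200, eps=1e-9, seed=0):
    """Approximate non-negative v (freq x time) as w @ h.

    Assumption: Lee-Seung multiplicative updates for the Frobenius
    objective; the paper's actual variant is not given in the abstract.
    """
    rng = random.Random(seed)
    n_freq, n_time = len(v), len(v[0])
    # Random strictly positive initialization keeps both factors non-negative.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    h = [[rng.random() + eps for _ in range(n_time)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_time)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_freq)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual v - w @ h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2
               for row_v, row_a in zip(v, approx)
               for x, y in zip(row_v, row_a)) ** 0.5
```

The sketch uses plain Python lists to stay dependency-free; an array library would be the usual choice for real Mel-spectral data. Calling `nmf(v, rank=2)` on a small non-negative matrix returns the two factors, and `frobenius_error` can be used to check how well their product reconstructs `v`.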
Source
《数据采集与处理》
CSCD
Peking University Core Journal
2016, Issue 5, pp. 1020-1027 (8 pages)
Journal of Data Acquisition and Processing
Funding
Supported by the National Natural Science Foundation of China (61301300)
Keywords
audio retrieval
audio fingerprint
non-negative matrix factorization
audio dictionary
inverted index