Research on audio segmentation and annotation for audio retrieval (面向音频检索的音频分割和标注研究) Cited by: 5
Abstract: Building a suitable audio index is one effective way to enable fast retrieval from large-scale audio databases, and audio segmentation and annotation are the foundation of such an index. This paper adopts a two-step audio segmentation algorithm based on short-time energy and an improved metric distance, so that the resulting segments show large feature differences between segments and small feature variance within each segment. Building on this segmentation, the audio streams in the audio database are annotated: a BP neural network algorithm labels the audio category and the Philips audio fingerprint algorithm labels the audio content, in preparation for building the audio index table. Experimental results show that the two-step algorithm segments arbitrary audio streams well, that the annotation algorithms label audio category and audio content effectively, and that the algorithms are also robust.
Authors: Sun Weiguo (孙卫国), Xia Xiuyu (夏秀渝), Qiao Lineng (乔立能), Ye Yulin (叶于林). Affiliations: College of Electronics and Information, Sichuan University, Chengdu 610064, China; 78438 Troops of the Chinese People's Liberation Army, Chengdu 610066, China
Source: Microcomputer & Its Applications (《微型机与应用》), 2017, No. 5, pp. 38-41 (4 pages)
Keywords: audio segmentation; short-time energy; metric distance; audio annotation; BP neural network; audio fingerprint
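
The abstract names short-time energy as the basis of the first segmentation step but gives no formulas or parameters. As a rough illustration only, the sketch below computes the frame-wise short-time energy of a signal and flags low-energy frames as candidate split points. The function names, the frame length, the hop size, and the threshold ratio are all assumptions made for this example and are not taken from the paper; the second step (the improved metric distance) is not described in the abstract and is not reproduced here.

```python
import math


def short_time_energy(samples, frame_len=400, hop=200):
    """Return the energy (sum of squared samples) of each frame.

    frame_len and hop are illustrative values, not the paper's settings.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def low_energy_frames(energies, ratio=0.1):
    """Return indices of frames whose energy is below ratio * mean energy.

    The relative threshold is an assumption chosen for this sketch.
    """
    if not energies:
        return []
    threshold = ratio * (sum(energies) / len(energies))
    return [i for i, e in enumerate(energies) if e < threshold]


# Toy signal at 8 kHz: 1 s of a 440 Hz tone, 0.5 s of silence, 1 s of tone.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
signal = tone + [0.0] * (rate // 2) + tone

energies = short_time_energy(signal)
candidates = low_energy_frames(energies)
```

On this toy signal, the flagged frames are the ones lying entirely inside the silent gap. A real system would still need a second pass, such as the distance-based step the abstract mentions, to decide which candidates are true segment boundaries.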