Abstract
Building a suitable audio index is one effective way to enable fast retrieval from large-scale audio databases, and audio segmentation and annotation are the basis for building such an index. This paper adopts a two-step audio segmentation algorithm based on short-time energy and an improved metric distance, so that the resulting segments show large feature differences between segments and small feature variance within each segment. Building on this segmentation, the audio streams in the audio database are annotated: a BP neural network algorithm labels the audio category and the Philips audio fingerprint algorithm labels the audio content, in preparation for subsequently building the audio index table. Experimental results show that the two-step segmentation algorithm segments arbitrary audio streams well, that the annotation algorithms effectively label both audio category and audio content, and that the algorithms are robust.
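As a rough illustration of the short-time energy idea named in the abstract, the sketch below computes per-frame energy and marks sustained low-energy runs as candidate segment boundaries. It is not the paper's algorithm: the function names, the frame length, the threshold, and the minimum run length are all assumptions chosen for a toy signal, and the second step (the improved metric distance) is not sketched because the abstract does not define it.

```python
import math

def short_time_energy(samples, frame_len, hop):
    """Return the energy (sum of squared samples) of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def low_energy_boundaries(energies, threshold, min_frames):
    """Return the centre frame index of every run of at least `min_frames`
    consecutive frames whose energy is below `threshold` (assumed rule)."""
    boundaries = []
    run_start = None
    # The infinite sentinel closes a low-energy run that reaches the end.
    for i, e in enumerate(energies + [float("inf")]):
        if e < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                boundaries.append((run_start + i - 1) // 2)
            run_start = None
    return boundaries

# Toy signal (hypothetical): 0.5 s tone, 0.25 s silence, 0.5 s tone at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 2)]
silence = [0.0] * (sr // 4)
signal = tone + silence + tone

energies = short_time_energy(signal, frame_len=200, hop=200)
boundaries = low_energy_boundaries(energies, threshold=1.0, min_frames=5)
print(boundaries)
```

On real recordings the threshold would normally be set relative to the signal's overall energy level rather than fixed, and the candidate boundaries would then be refined by a feature-distance test between neighbouring segments.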
Authors
Sun Weiguo (孙卫国)
Xia Xiuyu (夏秀渝)
Qiao Lineng (乔立能)
Ye Yulin (叶于林)
Affiliations: College of Electronics and Information, Sichuan University, Chengdu 610064, China; 78438 Troops of the Chinese People's Liberation Army, Chengdu 610066, China
Source
Microcomputer & Its Applications (《微型机与应用》)
2017, No. 5, pp. 38-41 (4 pages)
Keywords
audio segmentation
short-time energy
metric distance
audio annotation
BP neural network
audio fingerprint