A Novel Classification-Based Audio Segmentation Algorithm (cited by: 10)


Abstract: Content-based audio segmentation plays an important role in multimedia applications. Many conventional segmentation algorithms are based on small-scale audio classification and tend to produce too many false segmentation points, which seriously degrades their usefulness in practice. Our experiments show that large-scale audio clips are classified markedly more accurately than small-scale clips, and that this trend holds irrespective of the classifier. Based on this observation, and with the aim of reducing false segmentations, we present a novel classification-based framework for audio stream segmentation. First, a rough segmentation step based on large-scale classification is applied to preserve the integrity of each segment's content. Then a fine segmentation step based on small-scale classification locates the segmentation points more precisely within the boundary areas identified by the rough step. Both theoretical analysis and experimental results show that, for audio streams whose type changes infrequently, the method removes nearly 3/4 of the false segmentation points produced by the conventional method based on small-scale classification while keeping the missing rate low. The method is therefore well suited to practical tasks such as music broadcast segmentation and music video analysis.
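The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neural networks only as a keyword and gives no features, window sizes, or decision rules, so everything below is an assumption made for illustration. In particular, `majority_label` is a hypothetical stand-in classifier that votes over per-frame labels, and `large`, `small`, and both function names are invented here.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical classifier interface: maps a window of frames to a class label.
Classifier = Callable[[Sequence[str]], str]


def majority_label(window: Sequence[str]) -> str:
    """Stand-in classifier (assumption): most common per-frame label."""
    return Counter(window).most_common(1)[0][0]


def small_scale_segment(frames: Sequence[str], classify: Classifier,
                        small: int) -> List[int]:
    """Baseline: classify every small window and report each label change."""
    labels = [(s, classify(frames[s:s + small]))
              for s in range(0, len(frames), small)]
    return [s for (_, prev), (s, cur) in zip(labels, labels[1:]) if prev != cur]


def coarse_to_fine_segment(frames: Sequence[str], classify: Classifier,
                           large: int, small: int) -> List[int]:
    """Rough step on large windows, then fine step inside boundary regions."""
    starts = list(range(0, len(frames), large))
    coarse = [classify(frames[s:s + large]) for s in starts]
    points = []
    for i in range(1, len(starts)):
        if coarse[i] == coarse[i - 1]:
            continue  # no class change between these two large windows
        # Boundary region: the two adjacent large windows that disagree.
        lo = starts[i - 1]
        hi = min(starts[i] + large, len(frames))
        point = starts[i]  # fallback if the fine step finds nothing
        # Fine step: the first small window carrying the new label marks the point.
        for s in range(lo, hi - small + 1, small):
            if classify(frames[s:s + small]) == coarse[i]:
                point = s
                break
        points.append(point)
    return points


# Toy stream: speech for frames 0..21, music for 22..39, with two noisy
# frames (6 and 7) mislabelled as music inside the speech part.
frames = ["speech"] * 22 + ["music"] * 18
frames[6] = frames[7] = "music"

print(small_scale_segment(frames, majority_label, small=3))               # [6, 9, 21]
print(coarse_to_fine_segment(frames, majority_label, large=10, small=3))  # [22]
```

On this toy stream the small-scale baseline reports two false points caused by the noisy frames, while the coarse-to-fine variant reports only the real change. The fine step can only resolve the point to the small-window step size, and a stream whose type changes more often than once per large window would defeat the rough step, which matches the abstract's restriction to infrequently type-changed audio.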

Authors: Zhang Yibin (张一彬), Zhou Jie (周杰), Bian Zhaoqi (边肇祺), Zhang Dapeng (张大鹏)

Affiliations: Department of Automation, Tsinghua University; Department of Computing, The Hong Kong Polytechnic University

Source: Acta Electronica Sinica (《电子学报》), 2006, No. 4, pp. 612-617 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals

Funding: National Natural Science Foundation of China (Nos. 60573060, 60205002, 60332010); Beijing Natural Science Foundation (No. 4042020)

Keywords: audio classification; audio segmentation; false segmentation rate; neural network

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]



