改进的音频混合分割方法 (Improved Hybrid Audio Segmentation Method) Cited by: 4

Research on the Improved Hybrid Segmentation Algorithm for Audio

Abstract: Hybrid audio segmentation algorithms based on a distance measure and the Bayesian information criterion (BIC), of which DISTBIC is a typical example, confirm candidate segmentation points too aggressively and can therefore lose true segmentation points. To address this problem, a conservative confirmation method is proposed that gives a rejected candidate point several further opportunities to be validated. Because a fixed penalty factor cannot achieve both high precision and high recall, an adaptive penalty-factor algorithm based on detectability is proposed and then extended with a heuristic rule, yielding a penalty-factor adaptation method based on both detectability and heuristic rules. The experimental results indicate that the improved algorithm clearly outperforms the existing algorithms, with a substantial gain in performance.
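The role of the penalty factor in BIC validation can be illustrated with a minimal sketch. It is not the paper's implementation: it uses the standard ΔBIC test with a fixed penalty factor, and it assumes diagonal-covariance Gaussians to stay within the standard library (a full-covariance model is more usual). The names `log_det_diag_cov`, `delta_bic`, `is_change_point`, and `penalty_factor`, as well as the synthetic data, are hypothetical and chosen only for illustration. The detectability-based adaptation and the conservative re-validation described in the abstract are not reproduced here.

```python
import math
import random


def log_det_diag_cov(frames):
    """Log-determinant of a diagonal covariance: sum of per-dimension log variances."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # floor avoids log(0)
    return total


def delta_bic(frames, split, penalty_factor=1.0):
    """Delta-BIC for splitting `frames` at index `split`.

    Positive values favour two separate Gaussians (a change point);
    negative values favour a single Gaussian (no change).
    """
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:split], frames[split:]
    gain = 0.5 * (n * log_det_diag_cov(frames)
                  - len(left) * log_det_diag_cov(left)
                  - len(right) * log_det_diag_cov(right))
    # Extra parameters of the two-model hypothesis under the
    # diagonal-covariance assumption: dim means + dim variances.
    penalty = 0.5 * penalty_factor * (2 * dim) * math.log(n)
    return gain - penalty


def is_change_point(frames, split, penalty_factor=1.0):
    """Accept the candidate point when Delta-BIC is positive."""
    return delta_bic(frames, split, penalty_factor) > 0.0


# Synthetic 2-D "feature frames": one window with a clear change at
# frame 200 and one homogeneous window of the same length.
rng = random.Random(0)
seg_a = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
seg_b = [[rng.gauss(5.0, 1.0), rng.gauss(-5.0, 1.0)] for _ in range(200)]
mixed = seg_a + seg_b
homogeneous = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(400)]

print(is_change_point(mixed, 200), is_change_point(homogeneous, 200))
```

A larger `penalty_factor` lowers ΔBIC for every candidate, so fewer points are accepted (higher precision, lower recall); a smaller one does the opposite. This is the trade-off that motivates an adaptive penalty factor.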

Authors: 于俊清 (Yu Junqing), 胡小强 (Hu Xiaoqiang), 孙凯 (Sun Kai)

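Affiliations: School of Computer Science, Huazhong University of Science and Technology; Network and Computing Center, Huazhong University of Science and Technology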

Source: 《计算机辅助设计与图形学学报》 (Journal of Computer-Aided Design & Computer Graphics); indexed in EI, CSCD, and the Peking University Core Journals list; 2010, No. 7, pp. 1174-1181 (8 pages)

Funding: National Natural Science Foundation of China (60703049); Wuhan Youth Science and Technology Chenguang Program (200850731353)

Keywords: audio segmentation; Bayesian information criterion; penalty factor; adaptive

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
