An unsupervised speech segmentation algorithm based on the speaker (一种基于说话者的无监督语音分割算法), cited by: 3

Abstract: In cellular (mobile phone) conversational speech, the two speakers differ in both channel and acoustic characteristics, and these differences can be used to separate the conversation into the portions belonging to each speaker. This paper focuses on an unsupervised, metric-based speech segmentation algorithm and compares two distance measures: the Euclidean distance, and a measure that combines the generalized likelihood ratio (GLR) with duration. The latter uses the likelihood ratio from hypothesis testing to describe the similarity between two speech segments. Experiments with a text-independent speaker verification system on cellular conversational speech show that the GLR-and-duration measure outperforms the Euclidean distance: it detects the great majority of speaker change points at low computational cost.
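As an illustration of the GLR idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each segment of feature frames is modeled by a single full-covariance Gaussian (the abstract does not state the model), and it omits the duration term, whose exact form is not given here. The names `glr_distance`, `distance_curve`, `win`, `step` and `eps` are hypothetical.

```python
import numpy as np


def _logdet_cov(frames, eps=1e-6):
    """Log-determinant of the ML covariance of `frames` (rows are frames)."""
    d = frames.shape[1]
    # bias=True gives the maximum-likelihood estimate (divide by N);
    # eps * I is a small regularizer, an assumption made for numerical stability.
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True)) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def glr_distance(x, y):
    """-log GLR between two segments under single-Gaussian models.

    H0: x and y come from one Gaussian (same speaker).
    H1: x and y come from two separate Gaussians (different speakers).
    A larger value means the segments are less similar.
    """
    z = np.vstack([x, y])
    return 0.5 * (len(z) * _logdet_cov(z)
                  - len(x) * _logdet_cov(x)
                  - len(y) * _logdet_cov(y))


def distance_curve(frames, win=100, step=10):
    """GLR distance between the adjacent windows on each side of every candidate point."""
    points = range(win, len(frames) - win + 1, step)
    return [(t, glr_distance(frames[t - win:t], frames[t:t + win])) for t in points]
```

In a metric-based segmenter of this kind, peaks of the curve returned by `distance_curve` would be taken as candidate speaker change points; how the peaks are thresholded and how duration enters the measure are left open in this sketch.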

Authors: Gao Fuyou (高福友), Chen Yanxiang (陈雁翔)

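Affiliations: Department of Security and Protection, Zhejiang Police Vocational Academy (浙江警官职业学院安全防范系); School of Computer and Information, Hefei University of Technology (合肥工业大学计算机与信息学院)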

Source: Journal of Hefei University of Technology: Natural Science (《合肥工业大学学报（自然科学版）》); indexed in CAS, CSCD, PKU Core (北大核心); 2010, No. 5, pp. 683-686, 708 (5 pages)

Funding: Zhejiang Province security and protection system testing funded project (DB33/T334)

Keywords: cellular conversation; generalized likelihood ratio (GLR); distance measure; unsupervised speech segmentation

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
