Speech endpoint detection in low-SNRs environment based on perception spectrogram structure boundary parameter (Cited by: 9)

Abstract The Perception Spectrogram Structure Boundary (PSSB) parameter is proposed for speech endpoint detection as a preprocessing step for speech or speaker recognition. First, a speech enhancement based on auditory perception is carried out. Then a two-dimensional enhancement is applied to the spectrogram, exploiting the difference between the deterministic distribution of speech and the random distribution of noise. Finally, the endpoint decision is made using the PSSB parameter. Experimental results show that, in low-SNR environments from -10 dB to 10 dB, the proposed algorithm can achieve higher accuracy than existing endpoint detection algorithms. A detection accuracy of 75.2% is reached even at the extremely low SNR of -10 dB. The method is therefore suitable for speech endpoint detection in low-SNR environments.
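
The abstract outlines a three-stage pipeline (enhancement, two-dimensional processing of the spectrogram, then an endpoint decision). The record does not define the PSSB parameter or either enhancement stage, so the following is only a minimal sketch of that general shape, not the authors' method. It assumes a plain magnitude spectrogram, a 3 x 3 moving average as a stand-in for the two-dimensional enhancement, and a frame-energy threshold as a stand-in for the PSSB decision. All function names and the parameters `noise_frames` and `factor` are hypothetical.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Magnitude spectrogram of shape (n_frames, frame_len // 2 + 1)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def smooth_2d(spec, k=3):
    """k x k moving average over time and frequency.

    Assumption: a simple stand-in for the paper's two-dimensional
    enhancement. Structured (speech-like) energy survives averaging,
    while uncorrelated noise fluctuations are flattened.
    """
    pad = k // 2
    padded = np.pad(spec, pad, mode="edge")
    out = np.zeros_like(spec)
    for dt in range(k):
        for df in range(k):
            out += padded[dt:dt + spec.shape[0], df:df + spec.shape[1]]
    return out / (k * k)

def detect_endpoints(x, noise_frames=10, factor=3.0, frame_len=256, hop=128):
    """Return (start_sample, end_sample) of the active region, or None.

    Assumption: the first `noise_frames` frames contain only noise and
    are used to set the threshold. This is an energy threshold, not PSSB.
    """
    spec = smooth_2d(spectrogram(x, frame_len, hop))
    energy = (spec ** 2).sum(axis=1)
    threshold = factor * energy[:noise_frames].mean()
    active = np.flatnonzero(energy > threshold)
    if active.size == 0:
        return None
    return int(active[0] * hop), int(active[-1] * hop + frame_len)
```

Calling `detect_endpoints(x)` on a signal with a clear burst of activity returns sample indices that bracket the burst to within a few frames. A fixed energy threshold of this kind would likely degrade well before the -10 dB condition reported in the abstract, which is the regime the PSSB parameter is said to address.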

Authors WU Di, ZHAO Heming, HUANG Chengwei, XIAO Zhongzhe, ZHANG Xiaojun, XU Yishen, TAO Zhi

Affiliation College of Physics School of Electronic Information

Source Chinese Journal of Acoustics (声学学报（英文版）), 2014, No. 4, pp. 428-440 (13 pages)

Funding Supported by the National Natural Science Foundation of China (61071215, 61271359, 61372146).

Keywords Speech endpoint detection in low-SNRs environment based on perception spectrogram structure boundary parameter

Classification TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
