The Relative Weight of Temporal Envelope Cues Extracted with Different Bandwidths in Different Frequency Regions for Mandarin Sentence Recognition
Abstract
Objective: To investigate the relative weight of different frequency regions of temporal envelope (TE) cues, extracted with different bandwidths, in Mandarin sentence recognition.
Methods: The speech material was processed with the Fourier transform and split into bands 1.5 and 3 equivalent rectangular bandwidths (ERB) wide. TE information was extracted from each band with the Hilbert transform and grouped into 5 adjacent frequency regions. Speech recognition scores of normal-hearing listeners were measured under the different test conditions, and the relative weight of each frequency region was calculated with the least-squares method.
Results: Recognition scores were 3.90% to 4.80% when TE cues from a single frequency region were presented, 22.60% to 85.40% with two frequency regions, and 100% with all regions. With 1.5-ERB extraction, the mean weights of frequency regions 1 to 5 were 0.28, 0.08, 0.21, 0.25, and 0.18, respectively. With 3-ERB extraction, they were 0.29, 0.05, 0.32, 0.21, and 0.14, respectively.
Conclusion: As the bandwidth used for TE extraction increased, recognition scores for the same sentences generally decreased, possibly because less envelope information was available. At both extraction bandwidths, TE cues in the low-frequency region (80-585 Hz in the Chinese abstract; 80-502 Hz in the English abstract) carried a high weight, possibly because this region contains fundamental frequency information.
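The least-squares weighting step named in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes a simple additive model in which the score for a condition is the sum of the weights of the regions presented, and it normalizes the fitted weights to sum to 1. The helper names (`solve_linear`, `band_weights`) and the scores are hypothetical. The scores are synthesized from the 1.5-ERB weights reported above purely to check that the fit recovers them; they are not the study's data.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def band_weights(conditions, scores, n_bands=5):
    """Fit score ~ sum of weights of the regions present (assumed model),
    then normalize the weights so they sum to 1."""
    rows = [[1.0 if b in cond else 0.0 for b in range(n_bands)]
            for cond in conditions]
    # Normal equations of ordinary least squares: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n_bands)]
           for i in range(n_bands)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores))
           for i in range(n_bands)]
    raw = solve_linear(xtx, xty)
    total = sum(raw)
    return [w / total for w in raw]


# Synthetic self-check (NOT study data): build noise-free scores for all
# ten two-region conditions from an assumed weight vector, then refit.
assumed = [0.28, 0.08, 0.21, 0.25, 0.18]
conditions = list(combinations(range(5), 2))
scores = [sum(assumed[b] for b in cond) for cond in conditions]
weights = band_weights(conditions, scores)
print([round(w, 2) for w in weights])
```

With real listener data the scores would be noisy and possibly transformed before fitting, so the recovered weights would only approximate any underlying values.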
Authors: Wen Jinchang (文锦昌); Li Keyi (李可意); Guo Yang (郭洋); Xia Liang (夏俍); Xiao Lili (肖丽丽); Liu Chengqi (柳铖棋); Zheng Zhong (郑重) (Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiaotong University Affiliated Sixth People's Hospital, Shanghai, 200233, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, 200233, China)
Source: Journal of Audiology and Speech Pathology (《听力学及言语疾病杂志》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2021, Issue 6, pp. 599-603 (5 pages)
Funding: National Natural Science Foundation of China (81771015); Shanghai Municipal Science and Technology Commission (18DZ2260200); National Natural Science Foundation of China, International (Regional) Cooperation and Exchange Program (81720108010).
Keywords: Equivalent rectangular bandwidth; Temporal envelope; Mandarin; Frequency region; Relative weight