Temporal and spectral cues for speech recognition

Cited by: 22

Abstract: The present study explores the temporal and spectral cues for speech recognition in an attempt to provide information for improving the speech processing strategies in cochlear implant systems. A noise-excited vocoder was used in a series of experiments to determine the relative contribution of temporal and spectral cues to phoneme recognition and lexical tone recognition. Spectral information was controlled by varying the number of channels, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors. Normal-hearing adult subjects participated in the perceptual tests. The results demonstrated that both temporal and spectral cues are important for phoneme recognition in quiet and in noise. The plateau performance for consonant and vowel recognition in quiet was reached when the number of channels was 8 and 12, respectively, and the lowpass cutoff frequency was 16 and 4 Hz, respectively. In noise conditions, vowel recognition benefited from increased spectral resolution. For Mandarin Chinese tone recognition, the lowpass cutoff frequency required for asymptotic performance was 256 Hz, much higher than that required for English phoneme recognition. Tone recognition performance had not yet reached a plateau with 12 channels, the highest number used in this study.
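The two vocoder parameters described above (number of channels for spectral detail, envelope lowpass cutoff for temporal detail) can be illustrated with a minimal sketch. This is not the study's implementation: the band edges (150 to 5500 Hz, logarithmic spacing), the half-wave rectifier, the brick-wall FFT filters, and the names `fft_filter` and `noise_vocoder` are all assumptions chosen to keep the example short and NumPy-only.

```python
import numpy as np

def fft_filter(x, fs, lo, hi):
    """Zero-phase brick-wall filter: keep components with lo <= f <= hi (Hz)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def noise_vocoder(x, fs, n_channels=8, env_cutoff=16.0,
                  f_lo=150.0, f_hi=5500.0, seed=0):
    """Noise-excited vocoder sketch.

    n_channels sets the spectral resolution; env_cutoff (Hz) sets how much
    temporal detail of each band envelope survives. The frequency range and
    log spacing are assumed values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = fft_filter(x, fs, lo, hi)
        # Envelope extractor: half-wave rectify, then lowpass at env_cutoff.
        env = fft_filter(np.maximum(band, 0.0), fs, 0.0, env_cutoff)
        env = np.maximum(env, 0.0)  # clip small negative ripple
        # Carrier: white noise restricted to the same analysis band.
        carrier = fft_filter(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * carrier
    return out
```

Calling `noise_vocoder(x, fs, n_channels=12, env_cutoff=256.0)` would correspond to the richest tone-recognition condition mentioned in the abstract, while lowering either argument removes spectral or temporal information respectively.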
To study the relative importance of fine structure and temporal envelope in lexical tone recognition, a separate experiment using the auditory chimera technique was carried out, in which the temporal envelopes and fine structures of different tone tokens were exchanged. The perceptual results demonstrated that tone recognition relies mainly on the fine structure, as does melody perception, rather than on the temporal envelope, as does English speech perception. Therefore, to improve speech recognition, especially in noise, efforts should be concentrated on providing more effective channels in cochlear implant systems. Lexical tone recognition could benefit from fine structure information presented in cochlear implant stimulation.
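The envelope and fine-structure exchange behind a chimera can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the band count, the 80 to 4000 Hz range, the brick-wall FFT band filters, and the names `analytic_signal`, `band_pass`, and `chimera` are hypothetical. In each band, the envelope is the magnitude of the analytic signal of one sound and the fine structure is the cosine of the analytic phase of the other.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (the standard Hilbert-transform construction)."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def band_pass(x, fs, lo, hi):
    """Zero-phase brick-wall band-pass filter between lo and hi (Hz)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def chimera(env_source, fine_source, fs, n_bands=4, f_lo=80.0, f_hi=4000.0):
    """Combine the band envelopes of env_source with the fine structure of fine_source.

    n_bands, f_lo, and f_hi are assumed illustration values.
    """
    n = min(len(env_source), len(fine_source))
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        a = analytic_signal(band_pass(env_source[:n], fs, lo, hi))
        b = analytic_signal(band_pass(fine_source[:n], fs, lo, hi))
        envelope = np.abs(a)                  # temporal envelope of sound A
        fine_structure = np.cos(np.angle(b))  # unit-amplitude carrier of sound B
        out += envelope * fine_structure
    return out
```

For tone stimuli, `env_source` and `fine_source` would be two recordings of syllables carrying different tones; a listener reporting the tone of `fine_source` is following the fine structure, which is the pattern the abstract describes.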

Author: Xu Li (徐立)

Affiliation: School of Hearing

Source: Chinese Journal of Otology (《中华耳科学杂志》), CSCD, 2006, Issue 4, pp. 335-342 (8 pages)

Funding: US NIH (F32-DC00470, R01-DC03808, R03-DC006161); Ohio University research funds.

Keywords: Cochlear implant; Speech perception; Tone perception; Temporal cues; Spectral cues

Classification: R339.16 [Medicine and Health: Human Physiology]; H018.4 [Language and Writing: Linguistics]

