Abstract
The present study explores the temporal and spectral cues for speech recognition in an attempt to provide information for improving the speech processing strategies in cochlear implant systems. A noise-excited vocoder was used in a series of experiments to determine the relative contribution of temporal and spectral cues to phoneme recognition and lexical tone recognition. Spectral information was controlled by varying the number of channels, and temporal information was controlled by varying the lowpass cutoff frequency of the envelope extractors. Normal-hearing adult subjects participated in the perceptual tests. The results demonstrated that both temporal and spectral cues are important for phoneme recognition in quiet and in noise. In quiet, consonant and vowel recognition reached plateau performance at 8 and 12 channels, respectively, and at lowpass cutoff frequencies of 16 and 4 Hz, respectively. In noise, vowel recognition benefited from increased spectral resolution. For Mandarin Chinese tone recognition, the lowpass cutoff frequency required for asymptotic performance was 256 Hz, much higher than that required for English phoneme recognition. Tone recognition performance had not yet reached a plateau at 12 channels, the highest number used in this study. To study the relative importance of fine structure and temporal envelope in lexical tone recognition, a separate experiment was carried out using the auditory chimera technique, in which the temporal envelopes and fine structures of different tone tokens were exchanged. The perceptual results demonstrated that tone recognition relies more on fine structure, as does melody perception, than on the temporal envelope, on which English speech perception mainly depends. Therefore, to improve speech recognition, especially in noise, efforts should be concentrated on providing more effective channels in cochlear implant systems. Lexical tone recognition could benefit from fine structure information presented in cochlear implant stimulation.
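The noise-excited vocoder manipulation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the analysis range (`f_min`, `f_max`), the logarithmic band spacing, the second-order band-pass filters, the full-wave rectification, and the one-pole envelope smoother are all hypothetical choices made for brevity. The abstract specifies only that the number of channels sets the spectral detail and that the lowpass cutoff of the envelope extractors sets the temporal detail.

```python
import math
import random


def biquad_bandpass(x, fs, f_lo, f_hi):
    """Second-order band-pass filter (RBJ cookbook form, 0 dB peak gain).

    Assumption: one biquad per band; a real vocoder would likely use
    steeper filters.
    """
    f0 = math.sqrt(f_lo * f_hi)            # geometric centre frequency
    q = f0 / (f_hi - f_lo)                 # approximate Q from bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                 # b1 is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def one_pole_lowpass(x, fs, cutoff_hz):
    """First-order low-pass smoother; cutoff_hz limits envelope fluctuation rate."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for xn in x:
        state += a * (xn - state)
        y.append(state)
    return y


def noise_vocoder(x, fs, n_channels, env_cutoff_hz,
                  f_min=150.0, f_max=5500.0, seed=0):
    """Resynthesise x as envelope-modulated noise bands.

    n_channels    -> spectral detail (number of analysis/synthesis bands)
    env_cutoff_hz -> temporal detail (envelope low-pass cutoff)
    f_min, f_max, seed are hypothetical parameters, not from the source.
    """
    rng = random.Random(seed)
    # Assumption: band edges spaced logarithmically between f_min and f_max.
    edges = [f_min * (f_max / f_min) ** (k / n_channels)
             for k in range(n_channels + 1)]
    out = [0.0] * len(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = biquad_bandpass(x, fs, lo, hi)
        # Envelope: full-wave rectification followed by low-pass smoothing.
        env = one_pole_lowpass([abs(v) for v in band], fs, env_cutoff_hz)
        # Carrier: white noise restricted to the same band.
        noise = [rng.uniform(-1.0, 1.0) for _ in x]
        carrier = biquad_bandpass(noise, fs, lo, hi)
        for n in range(len(x)):
            out[n] += env[n] * carrier[n]
    return out
```

Under these assumptions, a call such as `noise_vocoder(signal, 16000, n_channels=8, env_cutoff_hz=16.0)` would loosely correspond to an 8-channel, 16 Hz condition, and raising `env_cutoff_hz` toward 256 would let more of the periodicity cues relevant to tone recognition pass through the envelope.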
Source
Chinese Journal of Otology (《中华耳科学杂志》)
CSCD
2006, Issue 4, pp. 335-342 (8 pages)
Funding
U.S. NIH (F32-DC00470, R01-DC03808, R03-DC006161)
Ohio University research funds
Keywords
Cochlear implant
Speech perception
Tone perception
Temporal cues
Spectral cues