The present study was designed to examine speech recognition in patients with sensorineural hearing loss when the temporal and spectral information in the speech signal was co-varied. Four subjects with mild to moderate sensorineural hearing loss were recruited to participate in consonant and vowel recognition tests that used speech stimuli processed through a noise-excited vocoder. The number of channels was varied between 2 and 32, which defined the spectral information. The lowpass cutoff frequency of the temporal envelope extractor was varied from 1 to 512 Hz, which defined the temporal information. Results indicate that performance varied tremendously among the subjects with sensorineural hearing loss. For consonant recognition, the patterns of relative contributions of spectral and temporal information were similar to those in normal-hearing subjects, and the utility of temporal envelope information appeared to be normal in the hearing-impaired listeners. For vowel recognition, which depended predominantly on spectral information, the performance plateau was reached with as many as 16-24 channels, much higher than expected given that frequency selectivity in patients with sensorineural hearing loss might be compromised. To understand how hearing-impaired listeners utilize spectral and temporal cues for speech recognition, future studies with a larger sample of patients with sensorineural hearing loss will be needed to elucidate the relationship between frequency selectivity, central processing capability, and speech recognition performance with vocoded signals.
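The noise-excited vocoder processing described above (band-pass analysis, envelope extraction with a variable lowpass cutoff, and modulation of noise carriers) can be sketched as follows. The function name, band edges, and filter orders are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a noise-excited vocoder (assumed design, not the
# authors' exact signal chain). Spectral resolution is set by
# n_channels; temporal resolution is set by env_cutoff (Hz).
import numpy as np
from scipy.signal import butter, sosfilt

def noise_vocode(signal, fs, n_channels=8, env_cutoff=64.0,
                 f_lo=80.0, f_hi=6000.0):
    """Vocode `signal` with `n_channels` noise bands spanning f_lo-f_hi."""
    # Logarithmically spaced band edges (an assumption; many studies
    # use cochlear-frequency-map spacing instead)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfilt(band_sos, signal)
        # Full-wave rectification, then lowpass to extract the envelope
        env = sosfilt(env_sos, np.abs(band))
        env = np.clip(env, 0.0, None)
        # Band-limited noise carrier modulated by the envelope
        carrier = sosfilt(band_sos, rng.standard_normal(len(signal)))
        out += env * carrier
    return out
```

With env_cutoff near 1 Hz almost all temporal detail is removed; at 512 Hz the envelope also conveys periodicity cues, which is how the study traded temporal against spectral information.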
Mandarin Chinese tone patterns vary in one of four ways: (1) high level; (2) rising; (3) low falling and rising; and (4) high falling. The present study examined the efficacy of an artificial neural network in recognizing these tone patterns. Speech data were recorded from 12 children (3-6 years of age) and 15 adults, all native Mandarin Chinese speakers. The fundamental frequency (F0) of each monosyllabic word in the speech data was extracted with an autocorrelation method. The pitch data (i.e., the F0 contours) were the inputs to a feed-forward backpropagation artificial neural network. The number of inputs to the network varied from 1 to 16, and the hidden layer contained from 1 to 16 neurons. The output layer consisted of four neurons representing the four tone patterns of Mandarin Chinese. After being trained with Levenberg-Marquardt optimization, the neural network successfully classified the tone patterns with about 90% accuracy for speech samples from both adults and children. The artificial neural network may provide an objective and effective way of assessing tone production in prelingually deafened children who have received cochlear implants.
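The autocorrelation F0 extraction that feeds the network can be sketched as below. The function name, lag-search range, and frame handling are illustrative assumptions; the downsampled F0 contour (1-16 points per syllable) would then serve as the network's input vector.

```python
# Sketch of autocorrelation-based F0 estimation for one analysis frame
# (assumed method details; the paper specifies only "an autocorrelation
# method"). Searches for the strongest autocorrelation peak within a
# plausible pitch-lag range.
import numpy as np

def f0_autocorr(frame, fs, f_min=60.0, f_max=500.0):
    """Estimate F0 (Hz) of a voiced frame via the autocorrelation peak."""
    frame = frame - frame.mean()                 # remove DC offset
    # One-sided autocorrelation: ac[l] = sum frame[n] * frame[n + l]
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / f_max)                    # shortest plausible period
    lag_max = int(fs / f_min)                    # longest plausible period
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return fs / lag
```

Running this estimator frame-by-frame over a syllable and resampling the resulting contour to a fixed length would yield the 1-16 input values described above; a level contour maps to tone 1, a rising one to tone 2, and so on.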
Funding: supported in part by NIH/NIDCD grants R03-DC006161 and R15-DC009504.