Abstract
BACKGROUND: The speech coding strategies of the cochlear implants currently used in China were designed around the characteristics of Western languages and do not take the features of Chinese speech into account, so tone recognition with them is poor.
OBJECTIVE: To analyze the tone-bearing phonemes of Mandarin monosyllabic words in the time and frequency domains, and to explore the factors that affect tone recognition in Chinese speech.
DESIGN, TIME AND SETTING: A multiple-sample observation was performed at the Hearing Center, Department of Otolaryngology, Head and Neck Surgery, Zhujiang Hospital of Southern Medical University, from April to June 2008.
MATERIALS: Speech materials were taken from the homophonic monosyllabic word tone recognition section of Guidance for Evaluation of Deaf Children Auditory Speech Rehabilitation (《聋儿听觉言语康复评估方法指导手册》): 10 syllables in 4 tones, 40 words in total.
METHODS: Time-domain analysis of the Mandarin monosyllabic words was performed with Cool Edit Pro 2.0 software. Amplitude spectra of the words were obtained by fast Fourier transform, and finite impulse response (FIR) digital filters were applied as a high-pass filter at 0.5 kHz, a band-pass filter from 0.5 to 4.0 kHz, and low-pass filters at 4 kHz and 2 kHz.
MAIN OUTCOME MEASURES: Tone envelope characteristics and duration of each monosyllabic word, and the results of the frequency-domain analysis.
RESULTS: In the time domain, the waveforms and envelopes showed that monosyllabic words carrying the same tone had highly similar fundamental-frequency envelopes, regardless of whether their initials and finals were the same. This indicates that time-domain information plays a very important role in recognizing the four Mandarin tones. In the frequency domain, the monosyllabic words were composed mainly of F0, F1, F2 and F3: F0 was the fundamental frequency, F1 and F2 were the second and third harmonics of F0, and F3 was the high-frequency component of the speech signal, which is important for speech clarity.
CONCLUSION: Discrimination of the four Mandarin tones relies mainly on time-domain and frequency-domain information.
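The frequency-domain steps named in the methods (an amplitude spectrum, then FIR low-pass and band-pass filtering) can be illustrated with a minimal sketch. It is not the study's procedure: the abstract does not state the FIR design method, filter length, or sampling rate, so the Hamming windowed-sinc design, the 101 taps, the 16 kHz rate, and the three synthetic tones standing in for a recorded word are all assumptions made here for illustration.

```python
import cmath
import math

FS = 16000  # assumed sampling rate in Hz (not stated in the abstract)


def amplitude_spectrum(x):
    """Single-sided amplitude spectrum by direct DFT; bin k is k * FS / len(x) Hz."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        edge = k == 0 or (n % 2 == 0 and k == n // 2)
        spec.append(abs(s) / n * (1.0 if edge else 2.0))
    return spec


def fir_lowpass(cutoff_hz, fs_hz, num_taps=101):
    """Hamming windowed-sinc low-pass taps, normalised to unity gain at DC."""
    fc = cutoff_hz / fs_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        m = i - mid
        h = 2 * fc if m == 0 else math.sin(2 * math.pi * fc * m) / (math.pi * m)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]


def fir_bandpass(low_hz, high_hz, fs_hz, num_taps=101):
    """Band-pass taps formed as the difference of two low-pass designs."""
    upper = fir_lowpass(high_hz, fs_hz, num_taps)
    lower = fir_lowpass(low_hz, fs_hz, num_taps)
    return [u - l for u, l in zip(upper, lower)]


def fir_filter(taps, x):
    """Causal FIR filtering: y[n] = sum over k of taps[k] * x[n - k]."""
    return [
        sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
        for n in range(len(x))
    ]


def steady(y, num_taps=101, n=400):
    """Drop the start-up transient and keep n samples of steady-state output."""
    return y[num_taps - 1:num_taps - 1 + n]


# Illustrative (frequency in Hz, amplitude) pairs, not measurements from the study.
tones = [(200, 1.0), (3000, 0.5), (6000, 0.25)]
x = [sum(a * math.sin(2 * math.pi * f * t / FS) for f, a in tones) for t in range(500)]

raw = amplitude_spectrum(x[100:500])
low = amplitude_spectrum(steady(fir_filter(fir_lowpass(2000, FS), x)))
band = amplitude_spectrum(steady(fir_filter(fir_bandpass(500, 4000, FS), x)))

for f, _ in tones:
    k = f * 400 // FS  # bin index for a 400-sample frame
    print(f, round(raw[k], 3), round(low[k], 3), round(band[k], 3))
```

Each printed row gives a tone's frequency followed by its amplitude in the unfiltered, 2 kHz low-pass, and 0.5 to 4.0 kHz band-pass versions, so it is easy to see which components each filter keeps. A library routine such as a fast FFT would normally replace the direct DFT; the slow form is used here only to keep the sketch dependency-free.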
Source
《中国组织工程研究与临床康复》
CAS
CSCD
Peking University Core Journals (北大核心)
2008, Issue 35, pp. 6827-6830 (4 pages)
Journal of Clinical Rehabilitative Tissue Engineering Research