
The Establishment of a Chinese Whisper Database and Perceptual Experiment (汉语耳语音库的建立与听觉实验研究)
Cited by: 13
Abstract (translated from the Chinese): The recognition and conversion of whispered speech is a new research topic, with applications such as communication in public places and certain special needs of public security and judicial work. A single-speaker, female-voice Chinese whispered speech database containing 1,172 characters and 98 similar-sounding words was first built. Through statistical analysis of the data from two listening tests, the study examined how accurately listeners identify the tones of whispered Chinese characters and similar-sounding words. For isolated characters, the identification rates of the four tones rank, from highest to lowest, as tone 3 > tone 4 > tone 2 > tone 1. Listeners were also found to identify the tones of words far better than those of single characters. Two feature parameters, the amplitude envelope and the duration, reflect the tonal characteristics of Chinese whispered speech. A tone recognition experiment on whispered characters based on these parameters reached the average identification rate of human listeners, laying a foundation for research on tone recognition in continuous whispered speech.

Abstract: Whispering is a special way of speaking used to convey a message quietly or privately. Whispered speech recognition and the reconstruction of normal speech from whispers are needed for some specific purposes, such as private communication by mobile phone in public, or speech processing for police or military use. However, little research has been conducted in these fields and many problems remain unsolved. In this paper, a Chinese whisper database is introduced in preparation for future work on whispered speech processing. The database consists of 1,172 characters and 98 "closed-tone" words (words of similar pronunciation) recorded from a single female speaker. Based on this database, two auditory perceptual tests were conducted to investigate how listeners identify the tones of single characters and closed-tone words in Chinese whispered speech. The experimental results support the following conclusions: (1) Tone 3 has the highest human perception accuracy, followed by tone 4 and tone 2, and tone 1 is the hardest to identify. (2) Women have better auditory perception than men as far as whispered speech is concerned. (3) The accuracy of tone identification depends largely on the duration of the sound. For tone 2 and tone 3, the longer the duration, the more accurate the identification. For tone 4, by contrast, the shorter the duration, the more accurate the identification. (4) Listeners distinguish similar-sounding Chinese words more easily when the words differ in the first character than when they differ in the last character. (5) Human perception of words in whispered speech is more accurate than that of single characters.
Because whispered speech has no fundamental frequency, other features must be found to represent the tones. It is conjectured that the amplitude contour and the duration can serve as the tone feature parameters in Chinese whispered tone recognition. A further experiment was designed to validate this. The results show that the average tone recognition rates for characters by computer are at the same level as those of human perception. This forms a foundation for future tone studies of continuous whispered speech.
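The two feature parameters named in the abstract, amplitude contour and duration, can be illustrated with a short sketch. This is not the authors' extraction method, which the abstract does not describe. It assumes a short-time RMS envelope and a simple relative-threshold rule for duration; the function names, frame length, hop size, and threshold are all hypothetical choices made for illustration.

```python
import math


def amplitude_envelope(samples, frame_len=256, hop=128):
    """Short-time RMS amplitude per frame (assumed stand-in for the amplitude contour)."""
    if not samples:
        return []
    envelope = []
    # Slide a fixed-length window over the signal; keep at least one frame.
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / len(frame)
        envelope.append(math.sqrt(mean_square))
    return envelope


def duration_seconds(envelope, hop, sample_rate, rel_threshold=0.1):
    """Length of the span whose envelope stays at or above rel_threshold * peak.

    The threshold rule is an assumption, not taken from the paper.
    """
    if not envelope:
        return 0.0
    peak = max(envelope)
    if peak == 0:
        return 0.0
    active = [i for i, v in enumerate(envelope) if v >= rel_threshold * peak]
    # Count frames from the first to the last active frame, inclusive.
    return (active[-1] - active[0] + 1) * hop / sample_rate
```

A tone classifier along the lines the abstract suggests would take the envelope, normalized in length and level, together with the duration as its input features. The choice of classifier is left open here because the abstract does not name one.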
Source: Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2005, No. 3, pp. 311-317 (7 pages)
Funding: National Natural Science Foundation of China (60272037 60340420325)
Keywords: whisper, tone, human perception (identification rate), amplitude contour, duration, tone recognition
