Identification of Nasal Consonant Perceptual Cues (鼻辅音感知线索研究)

Abstract: The performance of a speech recognition system is affected by many factors, such as speaker variation, speaking style, and ambient noise. One important way to make such systems more accurate and robust is to find better, more robust representations of the acoustic signal based on the perceptual characteristics of human hearing. To this end, the 3-Dimensional Deep Search (3DDS) method was developed to investigate the internal perceptual cues of speech signals in the human ear, and it has been applied successfully to identify the perceptual cues of plosive and fricative consonants in natural speech. In this paper, the method is extended to identify the perceptual cues of the nasal consonants /m, n/. Based on an analysis of the results from three perceptual experiments, the redundant cue and the secondary perceptual cue are defined. The perceptual cue of /m/ is the speech component lying at roughly 363~1250 Hz, and the perceptual cue of /n/ is the speech component lying at roughly 939~2826 Hz.

Authors: Bai Fan (白帆), Jont B. Allen

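Affiliations: School of Electronic Engineering, University of Electronic Science and Technology of China; University of Illinois at Urbana-Champaign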

Source: Journal of Signal Processing (《信号处理》), CSCD, Peking University Core Journal, 2015, No. 6, pp. 727-736 (10 pages)

Funding: U.S. National Institutes of Health (Grant No. R21-RDC009277A)

Keywords: nasal; recognition; perceptual cue; noise masking

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
