Research on Chinese Whispered Digit Speech Recognition (cited by: 2)

Speech Recognition of Chinese Whispered Speech

Abstract: Whispered speech recognition can serve certain special needs in national security. This paper introduces the physiological and acoustic characteristics of whispered speech. Whispered speech is excited by a noise source and its formants are shifted, which makes it harder to recognize than normal speech. Dual-threshold endpoint detection is applied to the speech samples, and experiments identify the best values of four parameters: the high and low thresholds of the short-time energy and of the short-time zero-crossing rate. The noise robustness of the feature parameters is analyzed in depth: short-time energy, first-order difference, second-order difference, and similar parameters are added to the MFCC features to improve their noise robustness. The results show that, in a hidden Markov model (HMM), using MFCC and LPCC jointly gives far better recognition than using either parameter alone.
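The dual-threshold endpoint detection mentioned in the abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: the function names, the frame sizes, the synthetic test signal, and all four threshold values are assumptions chosen for the demo, whereas the paper tunes its thresholds experimentally. The sketch marks frames as confident speech when either the short-time energy or the zero-crossing count exceeds its high threshold, then widens the segment outward while either measure stays above its low threshold.

```python
import math


def split_frames(x, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossings(frame):
    """Number of sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def detect_endpoints(x, frame_len, hop, e_high, e_low, z_high, z_low):
    """Return (first_frame, last_frame) of one speech segment, or None.

    Simplified dual-threshold rule (an assumption, not the paper's exact one):
    1. a frame is confident speech if energy > e_high or crossings > z_high;
    2. the span from the first to the last confident frame is widened while
       the neighbouring frame has energy > e_low or crossings > z_low.
    """
    fr = split_frames(x, frame_len, hop)
    energy = [short_time_energy(f) for f in fr]
    zcr = [zero_crossings(f) for f in fr]

    confident = [k for k in range(len(fr))
                 if energy[k] > e_high or zcr[k] > z_high]
    if not confident:
        return None

    def is_candidate(k):
        return energy[k] > e_low or zcr[k] > z_low

    start, end = confident[0], confident[-1]
    while start > 0 and is_candidate(start - 1):
        start -= 1
    while end < len(fr) - 1 and is_candidate(end + 1):
        end += 1
    return start, end


def demo_signal():
    """Hypothetical test signal: silence, a weak noise-like onset, a tone, silence."""
    x = [0.0] * 1400
    # Low-energy, high-zero-crossing onset (samples 1400..1599).
    x += [0.2 if n % 2 == 0 else -0.2 for n in range(1400, 1600)]
    # Strong 500 Hz tone at an assumed 8 kHz sampling rate (samples 1600..3199).
    x += [math.sin(2 * math.pi * 500 * n / 8000) for n in range(1600, 3200)]
    x += [0.0] * 1600
    return x


if __name__ == "__main__":
    span = detect_endpoints(demo_signal(), frame_len=200, hop=100,
                            e_high=80.0, e_low=20.0, z_high=150, z_low=50)
    print(span)  # (13, 31)
```

Frame k covers samples k*hop to k*hop + frame_len - 1, so the returned frame indices convert directly to sample positions. In the demo the weak onset is too quiet for either energy threshold and is picked up only through the zero-crossing thresholds, which is the reason a zero-crossing measure is paired with energy for noise-like sounds such as whispers.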

Author: DENG Xiuhui (邓秀慧)

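Affiliations: School of Computer Engineering, Nanjing Institute of Technology; College of Computer and Information, Hohai University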

Source: Audio Engineering (《电声技术》), 2014, No. 7, pp. 47-50 (4 pages)

Funding: National Natural Science Foundation of China (51101086)

Keywords: speech recognition; whispered speech; recognition research

Classification: O429 [Science: Acoustics]


