An English Fricative Detection Method Based on Energy Spectrum Entropy
Abstract: Based on the spectral characteristics of fricative production, a fricative detection method using energy spectrum entropy is proposed. First, phone boundaries are detected from the spectral energy characteristics of different phonemes. Next, the energy spectrum entropy of each speech segment is computed, and segments whose entropy exceeds a threshold are selected as candidates. Finally, the candidate segments are post-processed using features such as segment length and the abrupt energy changes at segment onset and offset, which removes false candidates (insertion errors). Experiments show that, in clean conditions and with a tolerance of 20 ms, the fricative detection rate reaches 96.9%.
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, Issue 6, pp. 554-560 (7 pages)
Keywords: Energy Spectrum Entropy, Fricative Detection, Phone Boundary Detection
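
The entropy-thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the code assumes the common definition of spectral entropy (Shannon entropy of the normalized power spectrum), normalized by the log of the number of frequency bins so that segments of different lengths are comparable. The function names, the Hann window, and the threshold value are hypothetical, the phone boundaries are assumed to be supplied by a separate detector, and the post-processing on segment length and energy jumps is not shown.

```python
import numpy as np


def spectral_entropy(segment):
    """Normalized Shannon entropy (0..1) of a segment's power spectrum.

    Assumed definition: p_k = |X_k|^2 / sum_j |X_j|^2,
    H = -sum_k p_k * log2(p_k) / log2(K), with K frequency bins.
    """
    segment = np.asarray(segment, dtype=float)
    if segment.size < 2:
        return 0.0  # too short to have a meaningful spectrum
    # Hann window (an assumption) to limit spectral leakage.
    windowed = segment * np.hanning(segment.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0  # silent segment: define entropy as zero
    p = power / total
    p = p[p > 0]  # skip empty bins, since 0 * log(0) is taken as 0
    entropy_bits = -(p * np.log2(p)).sum()
    return float(entropy_bits / np.log2(power.size))


def fricative_candidates(signal, boundaries, threshold):
    """Return (start, end) sample ranges whose entropy exceeds threshold.

    `boundaries` is a sorted list of sample indices, assumed to come
    from a separate phone boundary detector.
    """
    candidates = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        if spectral_entropy(signal[start:end]) > threshold:
            candidates.append((start, end))
    return candidates
```

Under this definition a noise-like segment, whose energy is spread across many frequency bins, scores close to 1, while a strongly periodic segment with a few dominant spectral peaks scores much lower. That contrast is the rationale for keeping the high-entropy segments as fricative candidates.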