
Chinese Stop Detection Based on Energy Change Rate (cited by: 1)
Abstract To address the instability of burst-spectrum features, this paper proposes a Chinese stop detection method based on energy change rate. The method first extracts, from the Seneff auditory spectrum, a set of parameters that describe the energy-change-rate characteristics of a speech segment. The features are then transformed with the Fisherface method, and the transformed features are classified with a K-nearest-neighbor (KNN) classifier to detect stops. Finally, model performance is assessed by leave-one-out cross-validation. Experimental results show a stop detection accuracy of 96.39% on clean speech and 88.07% on speech with a signal-to-noise ratio (SNR) of 10 dB, indicating that the model is stable and generalizes well.
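The evaluation step named in the abstract, a KNN classifier scored by leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Seneff auditory model and the Fisherface transform are omitted, `log_energy_change_rate` is a hypothetical stand-in that works on raw sample frames rather than an auditory spectrum, and the toy feature vectors and labels are invented for the example.

```python
import math
from collections import Counter


def log_energy_change_rate(frames):
    """First difference of per-frame log energy.

    Assumption: a stand-in for the paper's features, computed from raw
    sample frames instead of the Seneff auditory spectrum.
    """
    log_e = [math.log(sum(x * x for x in frame) + 1e-12) for frame in frames]
    return [b - a for a, b in zip(log_e, log_e[1:])]


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample once, classify it from the rest, return accuracy."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Invented toy data: two well-separated clusters of 2-D feature vectors.
features = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.0),
            (2.9, 3.1), (3.0, 2.8), (3.2, 3.0), (2.8, 3.2)]
labels = ["other"] * 4 + ["stop"] * 4

accuracy = leave_one_out_accuracy(features, labels, k=3)
rates = log_energy_change_rate([[1.0, 1.0], [2.0, 2.0]])
```

Leave-one-out is a natural choice when labelled segments are scarce, because every sample serves as a test case exactly once while the remaining samples form the training set. An odd `k` avoids tied votes in a two-class problem.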
Source Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2014, No. 3, pp. 116-122 (7 pages)
Funding National Natural Science Foundation of China (61175017)
Keywords stop detection; energy change rate; articulatory characteristic; Seneff auditory model