Phoneme Recognition Based on Distinctive Features

Abstract: Current mainstream speech recognition systems based on statistical models do not use speech production knowledge. To address this problem, a "bottom-up" phoneme recognition method based on distinctive features is proposed, which simulates the human process of speech perception and understanding. First, phoneme boundaries are detected according to the articulatory characteristics of different phonemes. Second, the distinctive features of the speech are detected by classifiers, and a mapping table is built from the correspondence between distinctive features and phonemes. Finally, the phoneme boundaries are used to obtain the feature sequence of each speech segment, and phoneme recognition is completed by fuzzy search and matching over these segment feature sequences. Experimental results show that, compared with traditional phoneme recognition methods based on the Hidden Markov Model, this method has clear advantages in recognition speed, robustness, and extensibility.
Affiliation: Information Engineering University
Source: Journal of Information Engineering University, 2013, No. 6, pp. 692-699 (8 pages)
Funding: National Natural Science Foundation of China (61175017)
Keywords: speech production knowledge; phoneme boundary detection; distinctive features; phoneme recognition; fuzzy matching
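
The three steps named in the abstract (boundary detection, distinctive feature detection, table lookup by fuzzy matching) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not specify the feature inventory, the classifiers, or the matching rule, so the four features, the four-phoneme table, the per-segment averaging, and the L1 distance are all hypothetical stand-ins, and the boundaries and per-frame detector scores are taken as given inputs.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical feature inventory and feature-to-phoneme mapping table.
# These values are illustrative only and are not taken from the paper.
FEATURES: Tuple[str, ...] = ("voiced", "nasal", "continuant", "vocalic")
PHONEME_TABLE: Dict[str, Tuple[int, ...]] = {
    "a": (1, 0, 1, 1),
    "m": (1, 1, 0, 0),
    "s": (0, 0, 1, 0),
    "t": (0, 0, 0, 0),
}


def segment_features(
    frames: Sequence[Sequence[float]], boundaries: Sequence[int]
) -> List[List[float]]:
    """Average the per-frame detector scores inside each segment.

    `boundaries` holds frame indices; segment k spans
    frames[boundaries[k]:boundaries[k + 1]].
    """
    segments = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        if end <= start:
            raise ValueError("boundaries must be strictly increasing")
        chunk = frames[start:end]
        segments.append(
            [sum(f[i] for f in chunk) / len(chunk) for i in range(len(FEATURES))]
        )
    return segments


def fuzzy_match(scores: Sequence[float]) -> str:
    """Return the phoneme whose table entry is nearest to `scores` (L1 distance).

    Nearest-entry lookup tolerates detector errors: a segment need not
    match any table row exactly (an assumed form of fuzzy matching).
    """
    return min(
        PHONEME_TABLE,
        key=lambda p: sum(abs(s - t) for s, t in zip(scores, PHONEME_TABLE[p])),
    )


def recognize(
    frames: Sequence[Sequence[float]], boundaries: Sequence[int]
) -> List[str]:
    """Map each detected segment to its best-matching phoneme."""
    return [fuzzy_match(seg) for seg in segment_features(frames, boundaries)]


if __name__ == "__main__":
    # Made-up detector scores for seven frames and three segments.
    demo_frames = [
        [0.9, 0.8, 0.1, 0.2], [0.8, 0.9, 0.2, 0.1],
        [0.9, 0.1, 0.8, 0.9], [1.0, 0.0, 0.9, 0.8], [0.9, 0.2, 0.7, 0.9],
        [0.1, 0.1, 0.9, 0.2], [0.2, 0.0, 0.8, 0.1],
    ]
    print(recognize(demo_frames, [0, 2, 5, 7]))
```

Because the lookup is a nearest-entry search over a small table rather than a decoding pass over a statistical model, adding a phoneme in this sketch only requires adding a table row.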