Research on Automatic Segmentation of Continuous Chinese Speech (连续汉语语音的自动切分研究)
Abstract: Automatic segmentation of continuous Chinese speech is a foundation of speech recognition, and an accurate segmentation method can replace manual labeling of Chinese syllables. Traditional automatic segmentation techniques for continuous Chinese speech, such as dual-threshold endpoint detection and cepstrum-based endpoint detection, do not perform well enough to meet the needs of speech recognition. This paper analyzes the continuous speech signal at multiple levels (the time domain, the frequency domain, and the cepstral domain) and combines endpoint detection, spectrum analysis, and cepstrum analysis to detect syllable segmentation points, yielding a multi-level segmentation method for continuous speech. Compared with traditional endpoint detection methods based on dual thresholds and the cepstrum, the method raises the accuracy of single-character segmentation to 92.8%.
Authors: LI Qi (李琦), ZHANG Erhua (张二华) (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2023, No. 4, pp. 959-964 (6 pages)
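Funding: Supported by the 13th Five-Year Plan Equipment Pre-Research Field Fund of the Equipment Development Department of the Central Military Commission (Grant No. 61403120102).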
Keywords: speech segmentation; endpoint detection; spectrogram; dual-threshold method (voice activity detection, VAD); frequency-band energy
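For readers unfamiliar with the dual-threshold endpoint detection that the abstract names as a traditional baseline, the following is a minimal sketch of the idea, not the paper's multi-level method. It uses short-time energy only (the classic variant also uses the zero-crossing rate, omitted here). The function names, the 8 kHz sampling rate, the frame and hop sizes, and the threshold values are all illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len=200, hop=80):
    """Short-time energy (sum of squared samples) of each frame.

    frame_len and hop are assumed values: 25 ms and 10 ms at 8 kHz.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def dual_threshold_segments(energies, high, low):
    """Return (first_frame, last_frame) pairs of detected speech.

    A segment is seeded at a frame whose energy exceeds `high`, then
    extended in both directions while energy stays above `low`.
    """
    segments = []
    i, n = 0, len(energies)
    while i < n:
        if energies[i] > high:
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and energies[end + 1] > low:
                end += 1
            segments.append((start, end))
            i = end + 1  # resume the search after this segment
        else:
            i += 1
    return segments


def demo_signal(rate=8000):
    """Synthetic test signal: silence, tone, silence, tone, silence."""
    silence = [0.0] * 800
    tone = [math.sin(2 * math.pi * 200 * k / rate) for k in range(1600)]
    return silence + tone + silence + tone + silence


if __name__ == "__main__":
    energies = frame_energies(demo_signal())
    # Thresholds are hand-picked for this synthetic signal (assumption).
    for first, last in dual_threshold_segments(energies, high=50.0, low=5.0):
        print(f"speech from frame {first} to frame {last}")
```

In practice the two thresholds are usually derived from the noise level of leading silent frames rather than fixed by hand. Fixed values like these also tend to fail on continuous speech where adjacent syllables are not separated by silence, which is the limitation the abstract attributes to traditional methods.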
