Pitch Estimation and Voicing Classification via MFCC-Based Spectrum Reconstruction

Published English title: Pitch Resolution Based on MFCC for Pitch Estimation and Sound Classification
Abstract: Pitch estimation and voicing classification help retrieve target speech quickly. They are among the most important and most difficult research problems in speech retrieval, and they matter for speech recognition as well. This paper proposes a new method for both tasks. The spectrum is reconstructed from Mel frequency cepstral coefficients (MFCC), and the reconstructed spectrum is then compressed and filtered in the log domain. Pitch is estimated by modeling the joint density of the pitch frequency and the filtered frequency with a Gaussian mixture model (GMM); on the TIMIT database the relative error is 6.62%. A GMM-based model also handles the voicing classification task, where experiments show an accuracy above 99%. The approach offers a new model for pitch estimation and voicing classification.
Authors: ZHANG Shao-hua; QIN Hui-bin (Institute of New Electron Device & Application, Hangzhou Dianzi University, Hangzhou 310018, China)
Source: Measurement & Control Technology (《测控技术》), 2019, No. 11, pp. 86-89, 131 (5 pages)
Keywords: speech recognition; pitch estimation; Mel frequency cepstral coefficient; Gaussian mixture model
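
The abstract names two building blocks: reconstructing a spectrum from MFCCs, and estimating pitch from a GMM over the joint density of pitch and a spectral feature. The sketch below illustrates both in plain Python under loud assumptions. It is not the authors' implementation. It assumes the MFCCs come from an orthonormal DCT-II of log-Mel band energies, reduces the spectral feature to a single scalar, and uses made-up GMM parameters (nothing here is trained on TIMIT). The function names and the relative-error definition are hypothetical choices for illustration.

```python
import math

def dct2(log_mel):
    """Orthonormal DCT-II: log-Mel band energies -> cepstral coefficients."""
    n = len(log_mel)
    coeffs = []
    for k in range(n):
        s = sum(log_mel[m] * math.cos(math.pi * k * (2 * m + 1) / (2 * n))
                for m in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

def reconstruct_log_mel(mfcc, n_bands):
    """Inverse of dct2 with truncated coefficients zero-padded.

    With fewer coefficients than bands, the result is a smoothed
    log-Mel spectrum (assumed reconstruction scheme)."""
    c = list(mfcc) + [0.0] * (n_bands - len(mfcc))
    spectrum = []
    for m in range(n_bands):
        s = c[0] * math.sqrt(1.0 / n_bands)
        s += sum(c[k] * math.sqrt(2.0 / n_bands)
                 * math.cos(math.pi * k * (2 * m + 1) / (2 * n_bands))
                 for k in range(1, n_bands))
        spectrum.append(s)
    return spectrum

def gauss(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gmm_pitch_estimate(x, components):
    """MMSE pitch estimate E[f0 | x] under a joint GMM over (x, f0).

    Each component is a dict with weight w, means mu_x and mu_f,
    feature variance var_x, and cross-covariance cov_fx."""
    resp = [c["w"] * gauss(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(resp)
    estimate = 0.0
    for r, c in zip(resp, components):
        # Per-component conditional mean of f0 given the feature x.
        cond_mean = c["mu_f"] + c["cov_fx"] / c["var_x"] * (x - c["mu_x"])
        estimate += (r / total) * cond_mean
    return estimate

def relative_error(estimated_f0, reference_f0):
    """|estimate - reference| / reference (assumed definition)."""
    return abs(estimated_f0 - reference_f0) / reference_f0

# Hypothetical two-component model, for illustration only.
toy_gmm = [
    {"w": 0.5, "mu_x": -1.0, "mu_f": 110.0, "var_x": 0.5, "cov_fx": 4.0},
    {"w": 0.5, "mu_x": 1.0, "mu_f": 210.0, "var_x": 0.5, "cov_fx": 6.0},
]
toy_f0 = gmm_pitch_estimate(0.8, toy_gmm)
```

In a real system the feature would be a vector, the scalar variance would become a covariance matrix, and the parameters would be fitted with EM on voiced training frames. The responsibility-weighted conditional mean keeps the same form.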