Multiple voice features types evolutionary selection algorithm
Abstract: Obtaining speech features through feature selection is currently a relatively effective approach to speaker recognition. However, the optimal speech features vary with the specific application environment. This paper therefore proposes a wrapper-based genetic feature selection algorithm over four types of speech features (FSF-WrGAF). The algorithm extracts four types of speech feature parameters and performs dynamic wrapper feature selection using the Chainlike Agent Genetic Algorithm (CAGA) and the Gaussian Mixture Model-Universal Background Model (GMM-UBM) to achieve high recognition accuracy. Several metrics are used to evaluate the algorithm. Experimental results show that the algorithm is simple to implement and significantly outperforms comparable algorithms on several metrics (recognition rate, equal error rate (EER), and DET curves).
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2016, No. 14, pp. 150-155, 219 (7 pages)
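Funding: National Natural Science Foundation of China (No. 91438104); Fundamental Research Funds for the Central Universities (No. CDJZR10160003, No. CDJZR13160008, No. CDJZR155507); China Postdoctoral Science Foundation (No. 2013M532153); Chongqing Postdoctoral Research Project Special Funding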
Keywords: speaker recognition; multiple voice features types; chain-like agent genetic algorithm; Gammatone Frequency Cepstrum Coefficient (GFCC); Mel Frequency Cepstrum Coefficient (MFCC); Linear Prediction Cepstrum Coefficient (LPCC)
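
The wrapper-style selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it uses a plain generational genetic algorithm rather than the chain-like agent variant (CAGA), and a toy scoring function stands in for the GMM-UBM recognition accuracy that a real wrapper would measure. The names `ga_feature_selection`, `toy_fitness`, and `INFORMATIVE`, and all parameter values, are hypothetical.

```python
import random

# Hypothetical stand-in: indices of features assumed to be useful.
INFORMATIVE = {0, 3, 5, 8}


def toy_fitness(mask):
    """Toy score replacing a classifier's validation accuracy.

    In a real wrapper method this would train and evaluate the recognizer
    (e.g. a GMM-UBM system) on the features selected by `mask`.
    """
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return hits - 0.5 * extras


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Search binary feature masks with a plain GA; return (mask, score)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    best_score = fitness(best)

    def tournament():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry over the best mask so far
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation toggles whether a feature is selected.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        candidate = max(pop, key=fitness)
        if fitness(candidate) > best_score:
            best, best_score = candidate[:], fitness(candidate)
    return best, best_score


mask, score = ga_feature_selection(12, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected, "score:", score)
```

Each individual is a bit mask over the candidate features, so the search cost is dominated by how often the fitness function is called; with a real recognizer as the fitness, caching scores for masks already evaluated would likely be worthwhile.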