
An Algorithm of Speech Recapture Detection

Cited by: 5
Abstract: An impostor can deceive a speaker recognition system by playing back a re-recording of a legitimate user's speech and thereby gain access to the system, which poses a threat to public security. Detecting recaptured speech is therefore an urgent practical problem, yet published research on it remains scarce. This paper proposes an algorithm for detecting recaptured speech. The algorithm uses statistics of MFCC (Mel-Frequency Cepstral Coefficients) as features for SVM (Support Vector Machine) and KNN (K-Nearest Neighbors) classification; in addition to these two classifiers, the detection performance of an SAE (Sparse Autoencoder) is also examined. To simulate realistic recapture scenarios, the experiments test the algorithm across a variety of recording devices, recording distances, and recording environments. Experimental results show that increasing the diversity of the recaptured speech used for training raises the accuracy of the algorithm to 99.67%, indicating good detection performance.
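The feature-and-classifier pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which MFCC statistics are used, so the per-coefficient mean and standard deviation below are an assumption, as are the function names `frame_statistics` and `knn_predict`, the value `k=3`, and the toy two-coefficient "frames" that stand in for real MFCC frames (MFCC extraction itself is assumed to happen elsewhere).

```python
import math
from collections import Counter
from statistics import mean, pstdev


def frame_statistics(frames):
    """Collapse a variable-length list of per-frame coefficient vectors
    into one fixed-length vector: per-coefficient means followed by
    per-coefficient population standard deviations (assumed statistics)."""
    columns = list(zip(*frames))  # one tuple per coefficient index
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors that are
    closest to `query` in Euclidean distance."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for MFCC frames (hypothetical values, two coefficients each).
original_a = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3]]
original_b = [[1.1, 0.2], [1.0, 0.0], [1.3, 0.4]]
recaptured_a = [[0.4, 0.9], [0.5, 1.1], [0.3, 1.0]]
recaptured_b = [[0.6, 0.8], [0.4, 1.2], [0.5, 1.0]]

train_vectors = [
    frame_statistics(u)
    for u in (original_a, original_b, recaptured_a, recaptured_b)
]
train_labels = ["original", "original", "recaptured", "recaptured"]

# Classify an unseen utterance from its frame statistics.
query = frame_statistics([[0.5, 1.0], [0.45, 0.95], [0.55, 1.05]])
prediction = knn_predict(train_vectors, train_labels, query, k=3)
print(prediction)
```

Summarizing each utterance by fixed-length statistics lets utterances of different durations be compared directly, which is what distance-based classifiers such as KNN (and kernel methods such as SVM) require.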
Authors: LI Shan-lu; WANG Yong; GAN Jun-ying (School of Information Engineering, Wuyi University, Jiangmen, Guangdong 529020, China; Corresponding Author, School of Electronic and Information, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, China)
Source: Journal of Signal Processing (CSCD; Peking University Core Journals), 2017, No. 1, pp. 95-101 (7 pages)
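Funding: National Natural Science Foundation of China (61672173, 61372193, 61072127); National Natural Science Foundation of China, Young Scientists Fund (61100168); Natural Science Foundation of Guangdong Province (S2013010013311, 2014A030313623); Guangdong Province Characteristic Innovation Project for Regular Universities (2015KTSCX083)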
Keywords: speech recapture detection; social security; Mel-frequency cepstral coefficients; support vector machine; K-nearest neighbors; sparse autoencoder
