Abstract
Extracting robust features is an important research topic in the machine learning community. In this paper, we investigate robust feature extraction for speech signals based on a tensor structure and develop a new method called constrained Nonnegative Tensor Factorization (cNTF). A novel feature extraction framework based on the cortical representation in the primary auditory cortex (A1) is proposed for robust speaker recognition. Motivated by the neural firing-rate model in A1, the speech signal is first represented as a general higher-order tensor; cNTF is then used to learn basis functions from multiple interrelated feature subspaces and to find a robust sparse representation of the speech signal. Computer simulations are given to evaluate the performance of our method, and comparisons with existing speaker recognition methods are also provided. The experimental results demonstrate that the proposed method achieves higher recognition accuracy in noisy environments.
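The abstract describes factorizing a nonnegative higher-order tensor of auditory features into basis functions under a sparsity constraint. The following is a minimal sketch of that general idea, a 3-way nonnegative CP factorization with an L1 penalty solved by multiplicative updates; it is an illustrative stand-in rather than the paper's exact cNTF algorithm, and the rank R, penalty weight lam, and the interpretation of the three modes are assumptions for the example.

```python
# Sketch: nonnegative 3-way tensor factorization with an L1 sparsity penalty,
# via multiplicative updates (not the authors' exact cNTF procedure).
import numpy as np

def sparse_ntf(X, R, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative tensor X (I x J x K) into R rank-1 components."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, R))  # e.g. time basis (assumed mode meaning)
    B = rng.random((J, R))  # e.g. frequency basis (assumed mode meaning)
    C = rng.random((K, R))  # e.g. utterance weights (assumed mode meaning)
    for _ in range(n_iter):
        # Multiplicative rules keep all entries nonnegative; adding lam to
        # each denominator imposes the L1 (sparsity) constraint.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + lam + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + lam + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + lam + eps)
    return A, B, C

# Toy usage on a random nonnegative feature tensor.
X = np.abs(np.random.default_rng(1).random((20, 16, 8)))
A, B, C = sparse_ntf(X, R=5)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print("relative reconstruction error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```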
Funding
Supported by the National Natural Science Foundation of China under Grant No. 60775007, the National Basic Research 973 Program of China under Grant No. 2005CB724301, and the Science and Technology Commission of Shanghai Municipality under Grant No. 08511501701.