Abstract
In speaker recognition under Nyquist-rate sampling, ensuring a high recognition rate requires recording long stretches of speech, so the volume of sampled data is very large and much of it is redundant, which wastes sampling resources. Compressed sensing (CS) theory addresses this problem. Based on CS theory, this paper projects the signal with a row-echelon (ladder) observation matrix and studies the relationship between compression ratio and recognition rate. At a compression ratio of 1:2, the recognition rate is maintained while the amount of sampled data is reduced to half of the original. For noisy environments, spectral subtraction is applied within the compressed sensing and feature extraction process: without reconstructing the time-domain signal, a robust feature parameter, CS-SSMFCC (Compressed Sensing Spectral Subtraction Mel Frequency Cepstral Coefficient), is extracted directly from the estimated clean speech power spectrum. Experimental results show that, compared with the traditional recognition parameter MFCC (Mel Frequency Cepstral Coefficient), CS-SSMFCC effectively improves the robustness of the system and has good noise resistance.
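The two operations named in the abstract, projection at a 1:2 compression ratio and power spectral subtraction, can be illustrated with a minimal sketch. It is not the paper's implementation. The function names `measure` and `spectral_subtract` and the parameters `alpha` and `beta` are hypothetical. A Gaussian random matrix stands in for the row-echelon observation matrix, whose construction the abstract does not give. The subtraction is applied to an ordinary power spectrum rather than in the CS domain.

```python
import math
import random


def measure(frame, ratio=0.5, seed=0):
    """Project a length-N frame onto M = ratio * N measurements, y = Phi x."""
    n = len(frame)
    m = int(n * ratio)
    if m < 1:
        raise ValueError("frame too short for the requested ratio")
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    # Assumption: Gaussian random rows replace the paper's row-echelon matrix.
    phi = [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]


def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Estimate the clean power per bin: max(|Y|^2 - alpha*|D|^2, beta*|Y|^2).

    alpha is an over-subtraction factor and beta a spectral floor that keeps
    the estimate non-negative; both values here are illustrative defaults.
    """
    return [max(y - alpha * d, beta * y)
            for y, d in zip(noisy_power, noise_power)]
```

With a 256-sample frame, `measure` returns 128 values, which matches the halving of sampled data described for the 1:2 ratio. The output of `spectral_subtract` plays the role of the estimated clean speech power spectrum from which cepstral features would then be computed.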
Source
Computer Technology and Development (《计算机技术与发展》)
2016, No. 3, pp. 18-22 (5 pages)
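Funding
National Natural Science Foundation of China (61271335)
National Basic Research Program of China ("973" Program) (2011CB302303)
Natural Science Foundation of Jiangsu Province (BK20140891)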
Keywords
compressed sensing
spectral subtraction
feature parameters
robustness