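Abstract: Recognition methods designed for music and speech are not suitable for recognizing environmental sounds. This paper proposes a method based on MFCC (Mel-frequency cepstral coefficients) and SVM (support vector machine) that combines feature representation with learning optimization to classify 10 types of office environmental sounds. The environmental sound data are the standard dataset downloaded from the IEEE Audio and Acoustic Signal Processing (AASP) Challenge Dataset. While the SVM parameters are analyzed and optimized, the number of Mel coefficients is varied so that effective MFCC feature representations are fully considered. Experimental results show that, with MFCC features and an SVM classifier evaluated by 5-fold cross-validation, the average classification accuracy reaches 88.05%, clearly better than the default MFCC-SVM algorithm.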
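The 5-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is only an illustration of the evaluation loop, not the paper's pipeline: the feature vectors are synthetic, and a nearest-centroid classifier stands in for the SVM so that the example needs nothing beyond the Python standard library. All function names (`k_fold_indices`, `centroid_fit`, `centroid_predict`, `cross_val_accuracy`) are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid_fit(X, y):
    """Stand-in classifier (NOT an SVM): store one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def centroid_predict(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], x)))

def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds, with each fold held out once for testing."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        model = centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(centroid_predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k

# Synthetic 2-D "features" for two well-separated classes (hypothetical data,
# standing in for per-clip MFCC feature vectors).
rng = random.Random(1)
X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)] + \
    [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(50)]
y = [0] * 50 + [1] * 50
mean_acc = cross_val_accuracy(X, y, k=5)
print(f"mean 5-fold accuracy: {mean_acc:.3f}")
```

To study the effect of the number of Mel coefficients, the same loop would be repeated once per feature dimensionality and the mean accuracies compared.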
Funding: Supported by the National Natural Science Foundation of China (No. 6007201).
Abstract: The Mel-frequency cepstral coefficient (MFCC) is the most widely used feature in speech and speaker recognition. However, MFCC is very sensitive to noise interference, which tends to drastically degrade the performance of recognition systems because of mismatches between training and testing conditions. In this paper, the logarithmic transformation in the standard MFCC analysis is replaced by a combined function to reduce the sensitivity to noise. The proposed feature extraction process is also combined with speech enhancement methods, such as spectral subtraction and median filtering, to further suppress the noise. Experiments show that the proposed robust MFCC-based feature significantly reduces the recognition error rate over a wide signal-to-noise ratio range.
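The two enhancement steps named in the abstract can be illustrated with a minimal sketch. It assumes magnitude-domain spectral subtraction with a proportional spectral floor and a sliding-window median filter; the abstract does not give the paper's exact formulation, so the parameters `alpha`, `floor`, and `width` and both function names are hypothetical. The combined function that replaces the logarithm is not specified in the abstract and is not reproduced here.

```python
from statistics import median

def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a scaled noise magnitude estimate from each frequency bin.

    Bins that would fall below floor * noisy magnitude are clamped to that
    floor so the result never becomes negative.
    """
    return [max(y - alpha * n, floor * y) for y, n in zip(noisy_mag, noise_mag)]

def median_filter(values, width=3):
    """Sliding-window median; the window shrinks at the sequence edges."""
    half = width // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

# Hypothetical magnitudes for three frequency bins of one frame.
cleaned = spectral_subtract([1.0, 0.5, 2.0], [0.4, 0.6, 0.5])

# A single-sample spike is removed by the median filter.
smoothed = median_filter([1, 1, 9, 1, 1], width=3)
```

In this sketch the second bin, where the noise estimate exceeds the observed magnitude, is clamped to the floor instead of going negative.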
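Abstract: Stress is an indispensable part of spoken communication and plays a very important role in it. To evaluate how well short-time spectral feature sets based on auditory models perform in Chinese stress detection, the MFCC (Mel frequency cepstrum coefficient) and RASTA-PLP (relative spectra perceptual linear prediction) algorithms are used to extract short-time spectral information from each speech segment, and an MFCC-based short-time spectral feature set and a RASTA-PLP-based short-time spectral feature set are constructed. A NaiveBayes classifier is used to model both feature sets, assigning each object to the class with the maximum posterior probability; this classification approach makes full use of the relevant speech characteristics of the current speech segment. On ASCCD (annotated speech corpus of Chinese discourse), the MFCC-based and RASTA-PLP-based feature sets achieve Chinese stress detection accuracies of 82.1% and 80.8%, respectively. The experimental results show that both MFCC-based and RASTA-PLP-based short-time spectral features can be used in Chinese stress detection research.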
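The maximum-posterior decision rule described in the abstract can be sketched as below. The abstract does not state which likelihood model the NaiveBayes classifier uses, so this sketch assumes independent Gaussian likelihoods per feature; the toy feature values, the class labels, and the function names are hypothetical.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean and variance for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(X), means, variances)
    return model

def log_posterior(model, x, c):
    """Unnormalized log posterior: log prior plus summed Gaussian log-likelihoods."""
    prior, means, variances = model[c]
    log_lik = sum(-0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                  for v, m, var in zip(x, means, variances))
    return math.log(prior) + log_lik

def predict(model, x):
    """Assign x to the class with the maximum posterior probability."""
    return max(model, key=lambda c: log_posterior(model, x, c))

# Toy 2-D feature vectors standing in for short-time spectral features.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
y = ["unstressed"] * 3 + ["stressed"] * 3
model = fit_gaussian_nb(X, y)
label = predict(model, [4.1, 5.0])
```

Working in the log domain avoids numerical underflow when many per-feature likelihoods are multiplied together.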
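Abstract: To improve the accuracy of recognizing the breathing sounds of underwater divers (frogmen), a feature matching method for diver breathing signals based on the Mel frequency cepstrum coefficient (MFCC) is proposed. The MFCC angle and MFCC distance are computed between breathing signals, and between those signals and both ambient noise and ship-radiated noise, and the values are compared for classification. Processing of data from a lake trial shows that the MFCC parameters of diver breathing sounds differ markedly from those of ship-radiated noise and ambient noise, so breathing signals can be distinguished from interfering noise. This demonstrates the effectiveness of the MFCC-based feature algorithm and is of practical significance for developing underwater diver detection sonar and early-warning systems for ports, docks, and other coastal waters.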
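The two matching measures named in the abstract can be sketched as follows. The abstract does not define them, so this sketch assumes that the MFCC angle is the angle between two MFCC vectors (the arccosine of their normalized inner product) and that the MFCC distance is the Euclidean distance; the function names and example vectors are hypothetical.

```python
import math

def mfcc_distance(a, b):
    """Euclidean distance between two MFCC vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mfcc_angle(a, b):
    """Angle in degrees between two MFCC vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical vectors: a scaled copy gives a near-zero angle,
# an orthogonal vector gives 90 degrees.
template = [1.0, 2.0]
same_shape = [2.0, 4.0]
angle_same = mfcc_angle(template, same_shape)
angle_orth = mfcc_angle([1.0, 0.0], [0.0, 1.0])
dist = mfcc_distance([0.0, 0.0], [3.0, 4.0])
```

Under these assumptions, a small angle and a small distance to a breathing-sound template would indicate a match, while noise sources would be expected to give larger values.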