9 articles found
A Comparison of Classifiers in Performing Speaker Accent Recognition Using MFCCs
1
Authors: Zichen Ma, Ernest Fokoué. Open Journal of Statistics, 2014, No. 4, pp. 258-266, 9 pages
An algorithm involving Mel-Frequency Cepstral Coefficients (MFCCs) is provided to perform signal feature extraction for the task of speaker accent recognition. Different classifiers are then compared based on the MFCC feature. For each signal, the mean vector of the MFCC matrix is used as the input vector for pattern recognition. A sample of 330 signals, containing 165 US voices and 165 non-US voices, is analyzed. In the comparison, k-nearest neighbors yields the highest average test accuracy over a cross-validation of size 500 and requires the least computation time.
Keywords: speaker accent recognition; Mel-frequency cepstral coefficients (MFCCs); discriminant analysis; support vector machines (SVMs); k-nearest neighbors
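The pipeline summarized in this entry (average each signal's MFCC matrix over frames, then classify the mean vector with k-nearest neighbors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `mean_vector` and `knn_predict`, the Euclidean distance, `k=3`, and the two-coefficient toy matrices are all assumptions made for the example.

```python
import math
from collections import Counter

def mean_vector(mfcc_matrix):
    """Average an MFCC matrix (rows = frames, columns = coefficients) over frames."""
    n_frames = len(mfcc_matrix)
    n_coeffs = len(mfcc_matrix[0])
    return [sum(frame[j] for frame in mfcc_matrix) / n_frames for j in range(n_coeffs)]

def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to `query` (Euclidean distance)."""
    ranked = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: each signal is a tiny 2-frame, 2-coefficient "MFCC" matrix.
train_vectors = [
    mean_vector([[1.0, 2.0], [1.2, 2.2]]),
    mean_vector([[0.9, 1.9], [1.1, 2.1]]),
    mean_vector([[5.0, 6.0], [5.2, 6.2]]),
    mean_vector([[4.8, 5.8], [5.0, 6.0]]),
]
train_labels = ["US", "US", "non-US", "non-US"]

query = mean_vector([[1.0, 2.1], [1.1, 2.0]])
predicted = knn_predict(train_vectors, train_labels, query, k=3)
```

In practice the MFCC matrices would come from a signal-processing library and would have more coefficients and far more frames than this toy data.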
Challenges and Limitations in Speech Recognition Technology: A Critical Review of Speech Signal Processing Algorithms, Tools and Systems
2
Authors: Sneha Basak, Himanshi Agrawal, Shreya Jena, Shilpa Gite, Mrinal Bachute, Biswajeet Pradhan, Mazen Assiri. Computer Modeling in Engineering & Sciences, SCIE, EI, 2023, No. 5, pp. 1053-1089, 37 pages
Speech recognition systems have become a unique human-computer interaction (HCI) family. Speech is one of the most naturally developed human abilities; speech signal processing opens up a transparent and hands-free computation experience. This paper aims to present a retrospective yet modern approach to the world of speech recognition systems. The development journey of ASR (Automatic Speech Recognition) has seen quite a few milestones and breakthrough technologies, which are highlighted in this paper. A step-by-step rundown of the fundamental stages in developing speech recognition systems is presented, along with a brief discussion of various modern-day developments and applications in this domain. This review paper aims to summarize the field and provide a starting point for those beginning in the vast field of speech signal processing. Since speech recognition has vast potential in various industries such as telecommunication, emotion recognition, and healthcare, this review would be helpful to researchers who aim to explore more applications that society can quickly adopt in future years of evolution.
Keywords: speech recognition; automatic speech recognition (ASR); Mel-frequency cepstral coefficients (MFCC); hidden Markov model (HMM); artificial neural network (ANN)
Feature Extraction of Typical Abnormal Sounds in Public Places (Cited by: 16)
3
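Authors: 栾少文, 龚卫国. 《计算机工程》, CAS, CSCD, PKU Core, 2010, No. 7, pp. 208-210, 3 pages
To address the low recognition rate obtained when Mel-frequency cepstral coefficients (MFCC) are used to represent abnormal sounds, an improved method for obtaining MFCC is proposed. It comprises an analysis of the characteristics of typical abnormal sound signals in public places and a redesign of the filter bank used in MFCC extraction. Experimental results on a database of abnormal sounds in public places show that, compared with the MFCC feature extraction method, the proposed method improves the efficiency of the feature parameters in the recognition system and offers certain advantages and practicality.
Keywords: abnormal sound; Mel-frequency cepstral coefficients; filter bank; hidden Markov model; feature extraction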
A Dynamic Speech Feature Extraction Algorithm Based on Cosine Similarity (Cited by: 9)
4
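Authors: 艾佳琪, 左毅, 刘君霞, 贺培超, 李铁山, 陈俊龙. 《计算机应用研究》, CSCD, PKU Core, 2020, No. S02, pp. 147-149, 3 pages
To further study speech feature extraction methods, speech features based on inverse discrete cosine transform cepstrum coefficients (IDCT CC) are analyzed. Using the cosine similarity between frequency-domain speech signals, the IDCT CC are grouped by hierarchical clustering to obtain a 14-dimensional frequency-domain speech feature vector, called the C-vector. In the experiments, a speaker recognition model based on the Gaussian mixture model (GMM) is built to examine the recognition accuracy and time of the C-vector, and comparative experiments are run against the classic Mel-frequency cepstral coefficients (MFCC) and the histogram of DCT cepstrum coefficients (HDCC). The experimental results show that the recognition accuracy of the proposed C-vector is 7% and 5% higher than that of MFCC and HDCC, respectively. Moreover, the C-vector shows even better recognition ability on multi-speaker speech sets.
Keywords: speaker recognition; speech features; Mel-frequency cepstral coefficients (MFCC); inverse discrete cosine transform cepstrum coefficient (IDCT CC); cosine similarity; hierarchical clustering analysis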
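The cosine similarity measure that this entry builds on is the dot product of two vectors divided by the product of their Euclidean norms. The sketch below shows only that measure. It does not reproduce the paper's hierarchical clustering or the C-vector construction, and the function name `cosine_similarity` and the toy vectors are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Return cos(theta) between two equal-length vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical vectors: parallel, orthogonal, and opposite directions.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Because the measure depends only on direction, scaling a vector leaves its similarity to any other vector unchanged.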
Environmental Sound Classification Using Deep Learning (Cited by: 7)
5
Authors: SHANTHAKUMAR S, SHAKILA S, SUNETH Pathirana, JAYALATH Ekanayake. Instrumentation, 2020, No. 3, pp. 15-22, 8 pages
Hearing-impaired individuals may be unable to identify environmental sounds because of the noise around them. However, very little research has been conducted in this domain. Hence, the aim of this study is to categorize sounds generated in the environment so that hearing-impaired individuals can distinguish the sound categories. To that end, we first define nine sound classes that typically exist in the environment: air conditioner, car horn, children playing, dog bark, drilling, engine idling, jackhammer, siren, and street music. We then record 100 sound samples from each category and extract features of each sound category using Mel-Frequency Cepstral Coefficients (MFCC). The training dataset is built from this set of features together with the class variable, the sound category. Sound classification is a complex task, and hence we use two deep learning techniques, Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN), to train classification models. The models are tested using a separate test set, and their performance is evaluated using precision, recall, and F1-score. The results show that the CNN model outperforms the MLP. However, the MLP also provided decent accuracy in classifying unknown environmental sounds.
Keywords: Mel-frequency cepstral coefficients (MFCC); Multi-Layer Perceptron (MLP); Convolutional Neural Network (CNN)
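The evaluation metrics named in this entry (precision, recall, and F1-score) can be computed per class from true and predicted labels. The sketch below is a generic illustration under assumed names (`precision_recall_f1`, `positive`) with made-up labels; it does not reproduce the paper's data or results.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1-score for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels for five test clips.
y_true = ["siren", "siren", "dog bark", "dog bark", "siren"]
y_pred = ["siren", "dog bark", "dog bark", "siren", "siren"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="siren")
```

For a multi-class task such as the nine sound classes above, the function would be called once per class and the per-class scores averaged if a single summary number is wanted.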
Application of formant instantaneous characteristics to speech recognition and speaker identification
6
Authors: 侯丽敏, 胡晓宁, 谢娟敏. Journal of Shanghai University (English Edition), CAS, 2011, No. 2, pp. 123-127, 5 pages
This paper proposes a new phase feature derived from the formant instantaneous characteristics for speech recognition (SR) and speaker identification (SI) systems. Using the Hilbert transform (HT), the formant characteristics can be represented by instantaneous frequency (IF) and instantaneous bandwidth, namely formant instantaneous characteristics (FIC). To explore the importance of FIC in both SR and SI, this paper proposes different FIC-derived features for SR and SI systems. When these new features are combined with conventional parameters, a higher identification rate can be achieved than with Mel-frequency cepstral coefficients (MFCC) parameters alone. The experimental results show that the new features are effective characteristic parameters and can serve as a complement to conventional parameters for SR and SI.
Keywords: instantaneous frequency (IF); Hilbert transform (HT); speech recognition; speaker identification; Mel-frequency cepstral coefficients (MFCC)
A Study of Gender Differences in Static MFCC Features
7
Authors: 杨继臣, 吴裕玲, 苏杰华. 《仲恺农业工程学院学报》, CAS, 2011, No. 4, pp. 54-56, 59, 4 pages
Gender differences in static Mel-frequency cepstral coefficient (MFCC) features are studied in terms of the peak differences, means, and variances of the probability density functions of male and female static MFCC features. The results show that, for peaks, MFCC1, MFCC2, MFCC6, MFCC9, and MFCC12 differ the most; for means, the male MFCC feature components are larger than the female ones; and for variances, most male MFCC feature components are smaller than the female ones.
Keywords: MFCC (Mel-frequency cepstral coefficients) features; gender difference; peak difference; mean; variance
Improved MFCC-Based Feature for Robust Speaker Identification (Cited by: 7)
8
Authors: 吴尊敬, 曹志刚. Tsinghua Science and Technology, SCIE, EI, CAS, 2005, No. 2, pp. 158-161, 4 pages
The Mel-frequency cepstral coefficient (MFCC) is the most widely used feature in speech and speaker recognition. However, MFCC is very sensitive to noise interference, which tends to drastically degrade the performance of recognition systems because of the mismatches between training and testing. In this paper, the logarithmic transformation in the standard MFCC analysis is replaced by a combined function to reduce the noise sensitivity. The proposed feature extraction process is also combined with speech enhancement methods, such as spectral subtraction and median filtering, to further suppress the noise. Experiments show that the proposed robust MFCC-based feature significantly reduces the recognition error rate over a wide signal-to-noise ratio range.
Keywords: Mel-frequency cepstral coefficient (MFCC); robust speaker identification; feature extraction
English Speech Recognition System on Chip
9
Authors: 刘鸿, 钱彦旻, 刘加. Tsinghua Science and Technology, SCIE, EI, CAS, 2011, No. 1, pp. 95-99, 5 pages
An English speech recognition system was implemented on a chip, called a speech system-on-chip (SoC). The SoC included an application-specific integrated circuit with a vector accelerator to improve performance. The sub-word model based on a continuous density hidden Markov model recognition algorithm ran on a very cheap speech chip. The algorithm was a two-stage fixed-width beam-search baseline system with a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce the recognition time. Tests show that this method reduces the recognition time nearly 6-fold and the memory size nearly 2-fold compared to the original system, with less than 1% accuracy degradation for a 600-word recognition task and a recognition accuracy rate of about 98%.
Keywords: non-specific human voice-consciousness; system-on-chip; Mel-frequency cepstral coefficients (MFCC)