123 articles found
A HMM-based Mandarin Chinese Singing Voice Synthesis System (cited by: 4)
1
Authors: Xian Li, Zengfu Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI), 2016, No. 2, pp. 192-202 (11 pages)
We propose a Mandarin Chinese singing voice synthesis system based on hidden Markov model (HMM) speech synthesis techniques. A Mandarin Chinese singing voice corpus is recorded, and musical contextual features are designed for training. F0 and the spectrum of the singing voice are modeled simultaneously with context-dependent HMMs. A new problem arises: the F0 data of a singing voice are always sparse because of the large number of contexts (tempo and pitch of note, key, time signature, etc.), so features that hardly ever appear in the training data cannot be modeled well. To address this problem, the difference between the F0 of the singing voice and that of the musical score (DF0) is modeled by a single Viterbi training. To overcome over-smoothing of the generated F0 contour, a syllable-level F0 model based on discrete cosine transforms (DCT) is applied, and the F0 contour is generated by integrating the two-level statistical models. The experimental results demonstrate that the proposed system outperforms the baseline system in both objective and subjective evaluations. The proposed system can generate a more natural F0 contour. Furthermore, the syllable-level F0 model can make the singing voice more expressive. © 2014 Chinese Association of Automation.
Keywords: cosine transforms; hidden Markov models; Markov processes; speech synthesis
HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks
2
Authors: Kazuki Sato, Takashi Nose, Akinori Ito. Journal of Computer and Communications, 2017, No. 10, pp. 50-65 (16 pages)
This paper proposes a technique for synthesizing a pixel-based photo-realistic talking face animation using two-step synthesis with HMMs and DNNs. We introduce facial expression parameters as an intermediate representation that corresponds well to both the input contexts and the output pixel data of face images. The sequences of facial expression parameters are modeled using context-dependent HMMs with static and dynamic features. The mapping from the expression parameters to the target pixel images is trained using DNNs. We examine the amount of training data required for the HMMs and DNNs and compare the performance of the proposed technique with the conventional PCA-based technique through objective and subjective evaluation experiments.
Keywords: visual-speech synthesis; talking head; hidden Markov models (HMMs); deep neural networks (DNNs); facial expression parameter
Fuzzy C-Means Clustering Based Phonetic Tied-Mixture HMM in Speech Recognition (cited by: 1)
3
Authors: 徐向华, 朱杰, 郭强. Journal of Shanghai Jiaotong University (Science) (EI), 2005, No. 1, pp. 16-20 (5 pages)
A phonetic tied-mixture HMM based on fuzzy clustering analysis (FPTM) is presented to decrease parameter size and improve the robustness of parameter training. FPTM is synthesized from state-tied HMMs by a modified fuzzy C-means clustering algorithm. Each Gaussian codebook of FPTM is built from the Gaussian components within the same root node of the phonetic decision tree. Experimental results on large-vocabulary Mandarin speech recognition show that, compared with a conventional phonetic tied-mixture HMM and a state-tied HMM with approximately the same number of Gaussian mixtures, FPTM reduces the word error rate by 4.84% and 13.02% respectively. Combining mixing-weight pruning with fuzzy merging of Gaussian centers yields a significant reduction in parameter size with little impact on recognition accuracy.
Keywords: speech recognition; hidden Markov model (HMM); fuzzy C-means (FCM); phonetic decision tree
Hidden Markov Models for Automatic Speech Recognition
4
Authors: Mbarki Aymen, Ammari Abdelaziz, Sghaier Halim, Hassen Maaref. Journal of Mechanics Engineering and Automation, 2011, No. 1, pp. 68-73 (6 pages)
In this paper the authors examine the three problems of hidden Markov models (HMMs): evaluation, decoding, and learning. The authors explore an approach to increase the effectiveness of HMMs in speech recognition. Although hidden Markov modeling has significantly improved the performance of current speech recognition systems, the general problem of completely fluent speaker-independent speech recognition is still far from solved. For example, no system can reliably recognize unconstrained conversational speech, and there is no good way to statistically infer language structure from a limited corpus of spoken sentences. The authors therefore provide an overview of HMM theory, discuss the role of statistical methods, and point out a range of theoretical and practical issues that must be understood to advance research in speech recognition.
Keywords: hidden Markov models (HMMs); speech recognition; HMM problems; Viterbi algorithm
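The decoding problem named in the abstract above is usually solved with the Viterbi algorithm listed in its keywords. The following is a minimal sketch of textbook Viterbi decoding for a discrete HMM, not code from the paper; the function name `viterbi` and the parameter layout (`start_p`, `trans_p`, `emit_p` as nested lists) are assumptions made for illustration.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a discrete HMM.

    obs     : list of observation indices
    start_p : start_p[i]    = P(first state is i)
    trans_p : trans_p[i][j] = P(next state is j | current state is i)
    emit_p  : emit_p[i][k]  = P(observation k | state i)
    (Hypothetical parameter layout chosen for this sketch.)
    """
    def log(p):
        # log(0) is treated as -inf so impossible paths are never chosen
        return math.log(p) if p > 0 else float("-inf")

    n = len(start_p)
    # delta[i]: best log-probability of any state path ending in state i
    delta = [log(start_p[i]) + log(emit_p[i][obs[0]]) for i in range(n)]
    backptr = []
    for o in obs[1:]:
        prev, step, delta = delta, [], []
        for j in range(n):
            # Best predecessor state for landing in state j at this step
            best_i = max(range(n), key=lambda i: prev[i] + log(trans_p[i][j]))
            step.append(best_i)
            delta.append(prev[best_i] + log(trans_p[best_i][j]) + log(emit_p[j][o]))
        backptr.append(step)

    # Trace back from the best final state
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backptr):
        path.append(step[path[-1]])
    path.reverse()
    return delta[last], path
```

Working in log-probabilities keeps the products from underflowing on long observation sequences, which is why the sketch sums logs instead of multiplying probabilities.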
Prosodically Rich Speech Synthesis Interface Using Limited Data of Celebrity Voice
5
Authors: Takashi Nose, Taiki Kamei. Journal of Computer and Communications, 2016, No. 16, pp. 79-94 (16 pages)
To enhance future communication between humans and robots at home, speech synthesis interfaces that can generate expressive speech are indispensable. In addition, synthesizing celebrity voices is commercially important. To address these issues, this paper proposes techniques for synthesizing natural-sounding speech with a rich prosodic personality using a limited amount of data in a text-to-speech (TTS) system. As a target speaker, we chose a well-known prime minister of Japan, Shinzo Abe, who has a good prosodic personality in his speeches. To synthesize natural-sounding and prosodically rich speech, accurate phrasing, robust duration prediction, and rich intonation modeling are important. For these purposes, we propose pause position prediction based on conditional random fields (CRFs), phone-duration prediction using random forests, and mora-based emphasis context labeling. We examine the effectiveness of these techniques through objective and subjective evaluations.
Keywords: parametric speech synthesis; hidden Markov model (HMM); prosodic personality; prosody modeling; conditional random field (CRF); random forest; emphasis context
Towards Realizing Mandarin-Tibetan Bi-lingual Emotional Speech Synthesis with Mandarin Emotional Training Corpus
6
Authors: Peiwen Wu, Hongwu Yang, Zhenye Gan. 《国际计算机前沿大会会议论文集》 (conference proceedings), 2017, No. 2, pp. 29-32 (4 pages)
This paper presents a method of hidden Markov model (HMM)-based Mandarin-Tibetan bilingual emotional speech synthesis by speaker adaptive training with a Mandarin emotional speech corpus. A one-speaker Tibetan neutral speech corpus, a multi-speaker Mandarin neutral speech corpus, and a multi-speaker Mandarin emotional speech corpus are first employed to train a set of mixed-language average acoustic models of the target emotion by speaker adaptive training. Then a one-speaker Mandarin neutral speech corpus or a one-speaker Tibetan neutral speech corpus is used to obtain a set of speaker-dependent acoustic models of the target emotion by speaker adaptation transformation. The Mandarin or Tibetan emotional speech is finally synthesized from the Mandarin or Tibetan speaker-dependent acoustic models of the target emotion. Subjective tests show that the average emotional mean opinion score is 4.14 for Tibetan and 4.26 for Mandarin. The average mean opinion score is 4.16 for Tibetan and 4.28 for Mandarin. The average degradation opinion score is 4.28 for Tibetan and 4.24 for Mandarin. Therefore, the proposed method can synthesize both Tibetan and Mandarin speech with high naturalness and emotional expression using only a Mandarin emotional training speech corpus.
Keywords: Mandarin-Tibetan cross-lingual emotional speech synthesis; hidden Markov model (HMM); speaker adaptive training; Mandarin-Tibetan cross-lingual speech synthesis; emotional speech synthesis
Towards Realizing Sign Language-to-Speech Conversion by Combining Deep Learning and Statistical Parametric Speech Synthesis
7
Authors: Xiaochun An, Hongwu Yang, Zhenye Gan. 《国际计算机前沿大会会议论文集》 (conference proceedings), 2016, No. 1, pp. 176-178 (3 pages)
This paper realizes a sign language-to-speech conversion system to solve the communication problem between healthy people and people with speech disorders. Thirty different static sign language gestures are first recognized by combining a support vector machine (SVM) with restricted Boltzmann machine (RBM) based regulation and feedback fine-tuning of the deep model. The text of the sign language is then obtained from the recognition results. A context-dependent label is generated from the recognized text by a text analyzer. Meanwhile, a hidden Markov model (HMM) based Mandarin-Tibetan bilingual speech synthesis system is developed using speaker adaptive training. Mandarin or Tibetan speech is then synthesized from the context-dependent label generated from the recognized sign language. Tests show that the static sign language recognition rate of the system reaches 93.6%. Subjective evaluation shows that the synthesized speech achieves a mean opinion score (MOS) of 4.0.
Keywords: deep learning; support vector machine; static sign language recognition; context-dependent label; hidden Markov model; Mandarin-Tibetan bilingual speech synthesis
Challenges and Limitations in Speech Recognition Technology:A Critical Review of Speech Signal Processing Algorithms,Tools and Systems
8
Authors: Sneha Basak, Himanshi Agrawal, Shreya Jena, Shilpa Gite, Mrinal Bachute, Biswajeet Pradhan, Mazen Assiri. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 5, pp. 1053-1089 (37 pages)
Speech recognition systems have become a unique human-computer interaction (HCI) family. Speech is one of the most naturally developed human abilities; speech signal processing opens up a transparent and hands-free computation experience. This paper aims to present a retrospective yet modern approach to the world of speech recognition systems. The development journey of automatic speech recognition (ASR) has seen quite a few milestones and breakthrough technologies, which are highlighted in this paper. A step-by-step rundown of the fundamental stages in developing speech recognition systems is presented, along with a brief discussion of various modern-day developments and applications in this domain. This review paper aims to summarize the field and provide a starting point for those entering the vast field of speech signal processing. Since speech recognition has vast potential in industries such as telecommunication, emotion recognition, and healthcare, this review should be helpful to researchers who aim to explore more applications that society can quickly adopt in the coming years.
Keywords: speech recognition; automatic speech recognition (ASR); mel-frequency cepstral coefficients (MFCC); hidden Markov model (HMM); artificial neural network (ANN)
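SVM+BiHMM: A Hybrid Model for Metadata Extraction Based on Statistical Methods (cited by: 27)
9
Authors: 张铭, 银平, 邓志鸿, 杨冬青. Journal of Software (《软件学报》) (EI, CSCD, PKU Core), 2008, No. 2, pp. 358-368 (11 pages)
This paper proposes SVM+BiHMM, a hybrid method for automatic metadata extraction based on the support vector machine (SVM) and the bigram hidden Markov model (BiHMM). While keeping the model structure unchanged, BiHMM modifies the HMM emission probability computation by distinguishing the beginning emission probability from the inner emission probability within a state. In the SVM+BiHMM hybrid model, a paper is first roughly divided by rules into header, body, and reference sections; an SVM model then classifies text blocks into metadata subclasses; next, a sigmoid function is used to fit the SVM classification results and adjust the word emission probabilities of the BiHMM; finally, the hybrid model performs metadata extraction. The SVM method effectively accounts for relations between blocks, and the BiHMM fully accounts for the position of words within a state, so the extraction results of the two complement and correct each other well. Experimental evaluation shows that the SVM+BiHMM algorithm extracts metadata better than other methods.
Keywords: metadata extraction; rule-based information extraction; support vector machine; hidden Markov model; bigram HMM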
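Bimodal Speech Recognition Method Based on Product HMM (cited by: 8)
10
Authors: 赵晖, 顾亚强, 唐朝京. Computer Engineering (《计算机工程》) (CAS, CSCD, PKU Core), 2010, No. 8, pp. 7-9 (3 pages)
For speech recognition in noisy environments, a product hidden Markov model (HMM) for bimodal speech recognition is proposed. On the basis of independently trained audio and video HMMs, a two-dimensional training model is built to represent the asynchrony between the audio and video streams. Weight coefficients are introduced so that the weights of the audio and video streams are adjusted adaptively to different noise environments. Experimental results show that the method achieves higher recognition performance than other bimodal speech recognition methods.
Keywords: bimodal speech recognition; product hidden Markov model; asynchrony; weight coefficient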
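Web Information Extraction Based on Simulated Annealing and Second-Order HMM (cited by: 7)
11
Authors: 李伟男, 李书琴, 景旭, 魏露, 李新乐. Computer Engineering and Design (《计算机工程与设计》) (CSCD, PKU Core), 2014, No. 4, pp. 1264-1268 (5 pages)
To address the sensitivity of the traditional hidden Markov model to initial values and its disregard of state history, SA-HMM2 is proposed, which trains the parameters of a second-order hidden Markov model with the simulated annealing algorithm. In the SA-HMM2-based Web information extraction method, the vision-based page segmentation algorithm VIPS partitions Web pages into blocks to obtain the state transition sequence, the proposed SA-HMM2 training algorithm obtains globally optimal HMM2 parameters, and an improved Viterbi algorithm performs the extraction. Experimental results show that the method improves the average combined measure by about 21% over HMM and about 7% over GA-HMM.
Keywords: Web information extraction; hidden Markov model; second-order hidden Markov model; simulated annealing algorithm; vision-based page segmentation algorithm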
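HMM-Based Endpoint Detection of Speech Signals in Noisy Environments (cited by: 12)
12
Authors: 朱杰, 韦晓东. Journal of Shanghai Jiaotong University (《上海交通大学学报》) (EI, CAS, CSCD, PKU Core), 1998, No. 10, pp. 14-16 (3 pages)
Improving the accuracy of speech endpoint detection in noisy environments is an important topic in automatic speech recognition (ASR) research. The commonly used endpoint detection methods based on short-time energy perform poorly for low-energy syllables or at low signal-to-noise ratios. An HMM-based method for speech endpoint detection is discussed. Models of background noise and garbage are first generated by training, the signal under test is then processed with the Viterbi decoding algorithm, and a concrete implementation is given. Experimental results show that the detection performance of the HMM-based method approaches that of manual detection, so the method is effective.
Keywords: hidden Markov model; endpoint detection; speech recognition; noise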
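Speech Recognition System Based on HMM and Genetic Neural Network (cited by: 14)
13
Authors: 包亚萍, 郑骏, 武晓光. Computer Engineering & Science (《计算机工程与科学》) (CSCD, PKU Core), 2011, No. 4, pp. 139-144 (6 pages)
This paper proposes a hybrid-model speech recognition method based on the hidden Markov model (HMM) and a back-propagation network optimized by a genetic algorithm (GA-BP). The method first uses HMMs to model the temporal structure of the speech signal and computes the output probability score of the speech against each HMM; these scores are fed into the optimized back-propagation network to obtain classification information; finally, a recognition decision is made according to the hybrid model's recognition algorithm. Existing sample data were used for training and testing in Matlab. Simulation results show that, because the design combines the strong temporal modeling ability of the HMM with the strong classification ability of the GA-BP neural network, the hybrid model is more robust to noise than a pure HMM, overcomes the local-optimum problem of neural networks, greatly increases recognition speed, and markedly improves the performance of the speech recognition system.
Keywords: speech recognition; hidden Markov model (HMM); genetic algorithm; back-propagation (BP) network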
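Research on an Improved HMM System for English Speech Synthesis (cited by: 5)
14
Authors: 张雪英, 陈洁, 孙颖. Journal of Taiyuan University of Technology (《太原理工大学学报》) (CAS, PKU Core), 2012, No. 1, pp. 16-19 (4 pages)
The HMM is improved according to characteristics of the English language: a context attribute set suited to English speech synthesis and a question set for model clustering are designed, improving modeling and training. With tools such as HTK and Festival, and with fundamental frequency and vocal tract spectral parameters as training parameters, the English speech synthesis system is implemented. Judging from the synthesized sentences, the synthesized speech is stable and fluent overall and has a fairly strong sense of rhythm.
Keywords: speech signal processing; HMM; trainable speech synthesis; English synthesis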
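Part-of-Speech Tagging of Classical Chinese Texts on Traditional Chinese Medicine Diagnosis Based on Second-Order HMM (cited by: 6)
15
Authors: 刘博, 杜建强, 聂斌, 刘蕾, 张鑫, 郝竹林. Computer Engineering (《计算机工程》) (CAS, CSCD, PKU Core), 2017, No. 7, pp. 211-216 (6 pages)
To address the limited context captured by part-of-speech tagging with the traditional hidden Markov model (HMM), an improved second-order HMM is proposed. The model takes contextual relations into account to tag traditional Chinese medicine diagnostic texts accurately. Array underflow during training is corrected by unknown-word handling and by adding a scaling factor. Experimental results show that the improved second-order HMM achieves higher tagging accuracy than the traditional HMM.
Keywords: classical texts on traditional Chinese medicine diagnosis; part-of-speech tagging; contextual relations; scaling factor; second-order hidden Markov model; unknown-word handling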
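The abstract above mentions adding a scaling factor to avoid numerical underflow during training. As a rough illustration of that general idea, the sketch below shows the standard scaled forward recursion for a first-order discrete HMM. It is not the paper's second-order model or its exact correction; the function name `forward_scaled` and the nested-list parameter layout are assumptions.

```python
import math


def forward_scaled(obs, start_p, trans_p, emit_p):
    """Return log P(obs) for a discrete HMM using per-step scaling.

    At every time step the forward variables are renormalized to sum
    to 1, and the log of each normalizer is accumulated. The sum of
    those logs equals the log-likelihood of the whole sequence.
    (First-order model; hypothetical parameter layout for this sketch.)
    """
    n = len(start_p)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the scaled forward variables one step
            alpha = [
                sum(alpha[i] * trans_p[i][j] for i in range(n)) * emit_p[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            # The observation sequence is impossible under this model
            return float("-inf")
        alpha = [a / scale for a in alpha]
        log_likelihood += math.log(scale)
    return log_likelihood
```

Without the renormalization, the raw forward variables shrink geometrically with sequence length and eventually round to zero in floating point; the scaled version stays finite for sequences of thousands of symbols.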
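Research on a Continuous Speech Recognition System Based on Forward-Backward HMM (cited by: 5)
16
Authors: 于晓明, 柏松. Computer Engineering and Design (《计算机工程与设计》) (CSCD, PKU Core), 2009, No. 18, pp. 4339-4341 (3 pages)
On the basis of an analysis of speech recognition principles, a continuous, small-vocabulary speech recognition system is designed on the TMS320DM642 DSP chip, using a forward-backward HMM acoustic model and the Viterbi algorithm for pattern training and recognition. Experimental results show that the system has a high recognition rate and a degree of robustness: the recognition rates in the laboratory and outdoors reach 96.8% and 91.2% respectively. The system is practical and portable.
Keywords: hidden Markov model; speech recognition; Markov; Viterbi algorithm; speech model; pattern matching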
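A New Re-estimation Algorithm for HMM Transition Probabilities (cited by: 5)
17
Authors: 李健, 王作英. Acta Electronica Sinica (《电子学报》) (EI, CAS, CSCD, PKU Core), 2001, No. z1, pp. 1833-1835 (3 pages)
Introducing the hidden Markov model (HMM) into speech recognition was a major contribution. However, the classical HMM's assumption that the state transition probabilities a_ij (i ≠ j) and the self-transition probability a_ii are independent makes the model inconsistent. In fact, the duration distribution and the state transition probabilities are not independent: either one uniquely determines the other. Starting from the duration distribution, this paper shows that the independence assumption on transition probabilities is unreasonable and derives a new re-estimation algorithm for the transition probabilities. The new algorithm re-estimates transition probabilities better than the Baum-Welch iteration of the classical HMM, with a relative reduction in misrecognition rate of about 5%.
Keywords: speech recognition; hidden Markov model; transition probability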
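As background to the link between state duration and transition probabilities discussed in the abstract above: in a standard first-order HMM, the number of consecutive frames $d$ spent in state $i$ is geometrically distributed and fixed entirely by the self-transition probability. This is the textbook relation, stated here as context only; it is not the paper's new re-estimation formula.

```latex
P_i(d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \qquad d = 1, 2, \dots,
\qquad
\mathbb{E}[d] = \frac{1}{1 - a_{ii}}
```

For example, $a_{ii} = 0.8$ gives an expected duration of $1/(1-0.8) = 5$ frames.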
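From Linear Prediction HMM to a New Hybrid Model for Speech Recognition (cited by: 3)
18
Authors: 欧智坚, 王作英. Acta Electronica Sinica (《电子学报》) (EI, CAS, CSCD, PKU Core), 2002, No. 9, pp. 1313-1316 (4 pages)
Unlike the traditional HMM, the linear prediction HMM (LPHMM) does not assume that state outputs are independent and identically distributed, yet its recognition performance in practice is poor. By analyzing the strengths and weaknesses of the two kinds of HMM, this paper proposes a new hybrid model for speech recognition that describes the static characteristics of speech (based on the traditional HMM) and the dynamic characteristics (based on LPHMM) separately while combining them organically. The model characterizes real speech more precisely while requiring only small changes to the system implementation and little extra computation. Experiments on Mandarin large-vocabulary speaker-independent continuous speech recognition show that the hybrid model performs significantly better than both LPHMM and the traditional HMM. On the theoretical side, a set of closed-form parameter re-estimation formulas for LPHMM is also given.
Keywords: linear prediction HMM; speech recognition; hybrid model; continuous speech recognition; hidden Markov model; linear prediction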
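Research on HMM-Based Tibetan Speech Synthesis (cited by: 5)
19
Authors: 周雁, 赵栋材. Computer Applications and Software (《计算机应用与软件》) (CSCD), 2015, No. 5, pp. 171-174 (4 pages)
For Tibetan speech synthesis, a complete HMM-based technical solution for synthesizing Lhasa Tibetan speech is proposed according to the regularities and characteristics of the Tibetan language. Its key techniques are described, including front-end corpus selection, Latin transliteration, word segmentation, and text analysis, and back-end prosody labeling, vocoder technology, speech modeling, and question-set design. Experimental results show that a Tibetan speech synthesis test system built on this solution achieves a good overall score.
Keywords: HMM; speech synthesis; Tibetan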
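Research on an Acoustic Model Based on a BPNN/HMM Neural Network (cited by: 2)
20
Authors: 李凡, 吴军, 黄刚. Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》) (EI, CAS, CSCD, PKU Core), 2004, No. 9, pp. 9-11 (3 pages)
A hybrid acoustic model based on a BP neural network and the hidden Markov model (HMM) is developed. The BP neural network mainly converts distorted speech feature vectors into clean speech feature vectors, and the HMM classifies the converted clean feature vectors, improving the robustness of the speech recognition system through model-level compensation. An MFCC speech feature extraction method based on linear prediction is discussed, in which the extracted distorted speech feature vectors serve as the input to the neural network.
Keywords: BP neural network; hidden Markov model; BPNN/HMM hybrid acoustic model; robust speech recognition; speech feature parameters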