Journal Articles
8 articles found
1. Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired
Authors: Palli Padmini, C. Paramasivam, G. Jyothish Lal, Sadeen Alharbi, Kaustav Bhowmick. Computers, Materials & Continua (SCIE, EI), 2022, Issue 2, pp. 4027-4051.
This work presents a statistical method to translate human voices across age groups, based on commonalities in the voices of blood relations. The age-translated voices were naturalized by extracting blood-relation features (e.g., pitch, duration, energy) using Mel Frequency Cepstrum Coefficients (MFCC), for social compatibility of the voice-impaired. The system was demonstrated using standard English and an Indian language. The voice samples for resynthesis were derived from 12 families, with member ages ranging from 8 to 80 years. The voice-age translation, performed with the Pitch Synchronous Overlap and Add (PSOLA) approach by modulating the extracted voice features, was validated by a perception test. The translated and resynthesized voices were correlated using the Linde-Buzo-Gray (LBG) and Kekre's Fast Codebook Generation (KFCG) algorithms. For translated voice targets, a strong correlation (θ > ~93% and θ > ~96%) was found with blood relatives, whereas a weak correlation (θ < ~78% and θ < ~80%) was found between different families and between different genders within the same family. The study further subcategorized the sampling and synthesis of the voices into same-gender and dissimilar-gender groups, using a support vector machine (SVM) to choose among the available voice samples. Finally, accuracies of ~96%, ~93%, and ~94% were obtained for identifying the gender of a voice sample, identifying its age group, and the correlation between the original and converted voice samples, respectively. The results were close to the features of natural voice samples and are envisaged to facilitate a near-natural voice for the speech-impaired.
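The LBG codebook step described above can be illustrated with a short NumPy sketch. This is a generic Linde-Buzo-Gray splitting procedure applied to MFCC frames, not the authors' implementation; the codebook size, perturbation factor, file names, the use of librosa for MFCC extraction, and the crude codebook-correlation score at the end are all assumptions made for illustration.

```python
import numpy as np
import librosa

def lbg_codebook(features, size=16, eps=0.01, n_iter=20):
    """Generic Linde-Buzo-Gray (LBG) codebook generation by centroid splitting.

    features : (n_frames, n_dims) array, e.g. MFCC vectors of one speaker.
    size     : target codebook size (assumed to be a power of two here).
    """
    codebook = features.mean(axis=0, keepdims=True)
    while codebook.shape[0] < size:
        # Split each centroid into two slightly perturbed copies.
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(n_iter):
            # Nearest-centroid assignment followed by a centroid update.
            dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            for k in range(codebook.shape[0]):
                members = features[labels == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

# Hypothetical usage: build codebooks for two voice samples and compare them.
y1, sr = librosa.load("relative_a.wav", sr=16000)   # placeholder file names
y2, _ = librosa.load("relative_b.wav", sr=16000)
cb1 = lbg_codebook(librosa.feature.mfcc(y=y1, sr=sr, n_mfcc=13).T)
cb2 = lbg_codebook(librosa.feature.mfcc(y=y2, sr=sr, n_mfcc=13).T)
# A crude similarity score: correlation between flattened, sorted codebooks.
similarity = np.corrcoef(np.sort(cb1, axis=0).ravel(), np.sort(cb2, axis=0).ravel())[0, 1]
```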
Keywords: blood relations, KFCG, LBG, MFCC, vector quantization, correlation, speech samples, same gender, dissimilar gender, voice conversion, PSOLA, SVM
2. ON USING NON-LINEAR CANONICAL CORRELATION ANALYSIS FOR VOICE CONVERSION BASED ON GAUSSIAN MIXTURE MODEL
Authors: Jian Zhihua, Yang Zhen. Journal of Electronics (China), 2010, Issue 1, pp. 1-7.
Voice conversion algorithms aim to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper is to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation is achieved mainly by altering the vocal tract characteristics represented by Line Spectral Frequencies (LSF). To make the transformed speech sound more like the target voice, prosody modification is carried out through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrate that the proposed algorithm is effective and outperforms the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
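The general idea, partitioning the acoustic space with a mixture model and learning a CCA-based mapping inside each partition, can be sketched as follows. It is a simplified stand-in for the paper's NLCCA-on-joint-GMM formulation: the GMM here is fitted on the source features only, hard component assignment replaces posterior weighting, and the residual-prediction prosody stage is omitted; the scikit-learn usage and all parameter values are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cross_decomposition import CCA

def train_piecewise_cca(src_lsf, tgt_lsf, n_mix=8, n_cca=8):
    """Fit one CCA regressor per GMM component on time-aligned LSF frames.

    src_lsf, tgt_lsf : (n_frames, lsf_order) aligned source/target features.
    """
    gmm = GaussianMixture(n_components=n_mix, covariance_type="full",
                          random_state=0).fit(src_lsf)
    labels = gmm.predict(src_lsf)
    models = {}
    for k in range(n_mix):
        idx = labels == k
        if idx.sum() > 10 * n_cca:           # need enough frames per component
            models[k] = CCA(n_components=n_cca).fit(src_lsf[idx], tgt_lsf[idx])
    return gmm, models

def convert(src_lsf, gmm, models):
    """Map each source frame with the CCA model of its most likely component."""
    labels = gmm.predict(src_lsf)
    out = np.array(src_lsf, dtype=float, copy=True)  # fall back to identity
    for k, cca in models.items():
        idx = labels == k
        if idx.any():
            out[idx] = cca.predict(src_lsf[idx])
    return out
```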
Keywords: speech processing, voice conversion, Non-Linear Canonical Correlation Analysis (NLCCA), Gaussian Mixture Model (GMM)
3. AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE
Authors: Zhou Ying, Zhang Linghua. Journal of Electronics (China), 2011, Issue 4, pp. 518-523.
This paper presents an improved method for a voice conversion system based on Gaussian Mixture Models (GMM) that changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract spectral features, and GMMs are trained to generate the conversion function. The spectral features of the source speech are converted by this function. The time-scale of the speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and more intelligible.
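In most of this literature the GMM conversion function is the standard joint-density regression, F(x) = sum_k p(k|x) * (mu_y_k + S_yx_k * inv(S_xx_k) * (x - mu_x_k)). The sketch below implements that textbook mapping with scikit-learn and SciPy; it is not the paper's code, and the STRAIGHT analysis/synthesis and time-scale modification steps are not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def train_joint_gmm(src, tgt, n_mix=16):
    """Fit a joint-density GMM on time-aligned source/target spectral frames."""
    return GaussianMixture(n_components=n_mix, covariance_type="full",
                           reg_covar=1e-4, random_state=0).fit(np.hstack([src, tgt]))

def gmm_convert(gmm, src):
    """Textbook joint-GMM regression from source to target spectral features."""
    d = src.shape[1]
    mu_x, mu_y = gmm.means_[:, :d], gmm.means_[:, d:]
    S_xx = gmm.covariances_[:, :d, :d]
    S_yx = gmm.covariances_[:, d:, :d]
    # Posterior p(k | x) computed from the source marginal of each component.
    log_post = np.stack([np.log(gmm.weights_[k]) +
                         multivariate_normal.logpdf(src, mu_x[k], S_xx[k])
                         for k in range(gmm.n_components)], axis=1)
    post = np.exp(log_post - logsumexp(log_post, axis=1, keepdims=True))
    out = np.zeros((src.shape[0], mu_y.shape[1]))
    for k in range(gmm.n_components):
        A = S_yx[k] @ np.linalg.inv(S_xx[k])             # cross-covariance regression
        out += post[:, k:k + 1] * (mu_y[k] + (src - mu_x[k]) @ A.T)
    return out
```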
Keywords: Gaussian Mixture Models (GMM), Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT), time-scale, voice conversion
4. A NOVEL ALGORITHM FOR VOICE CONVERSION USING CANONICAL CORRELATION ANALYSIS
Authors: Jian Zhihua, Yang Zhen. Journal of Electronics (China), 2008, Issue 3, pp. 358-363.
A novel algorithm for voice conversion is proposed in this paper. The mapping function between the spectral vectors of the source and target speakers is calculated by Canonical Correlation Analysis (CCA) estimation based on Gaussian mixture models. Since the spectral envelope feature retains the majority of the second-order statistical information contained in speech after Linear Prediction Coding (LPC) analysis, the CCA method is more suitable for spectral conversion than Minimum Mean Square Error (MMSE) estimation, because CCA explicitly considers the variance of each component of the spectral vectors during the conversion procedure. Both objective evaluations and subjective listening tests were conducted. The experimental results demonstrate that the proposed scheme achieves better performance than the previous method based on the MMSE estimation criterion.
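At its core, the comparison is between a plain least-squares (MMSE) regression and a CCA projection between time-aligned source and target spectral vectors. A minimal scikit-learn illustration follows; the feature dimensions and the synthetic placeholder data are assumptions, and the paper's per-mixture formulation is not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LinearRegression

# src_train, tgt_train stand in for time-aligned (n_frames, n_dims) spectral
# vectors, e.g. LSFs from LPC analysis; synthetic placeholder data only.
rng = np.random.default_rng(0)
src_train = rng.standard_normal((2000, 16))
tgt_train = src_train @ rng.standard_normal((16, 16)) * 0.5 + 0.1 * rng.standard_normal((2000, 16))

cca = CCA(n_components=8).fit(src_train, tgt_train)    # CCA-based mapping
mmse = LinearRegression().fit(src_train, tgt_train)    # baseline MMSE mapping

src_test = rng.standard_normal((100, 16))
converted_cca = cca.predict(src_test)
converted_mmse = mmse.predict(src_test)
```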
Keywords: speech processing, voice conversion, Canonical Correlation Analysis (CCA)
5. A modified voice conversion algorithm using compressed sensing (cited 8 times)
Authors: JIAN Zhihua, WANG Xiangwen. Chinese Journal of Acoustics, 2014, Issue 3, pp. 323-333.
A voice conversion algorithm that exploits the information between continuous frames of speech via compressed sensing is proposed in this paper. Based on the sparsity of the concatenated vector of several continuous Line Spectrum Pairs (LSP) in the discrete cosine transform domain, compressed sensing is used to extract a compressed vector from the concatenated LSPs, which serves as the feature vector for training the conversion function. The evaluation results demonstrate that the approach improves on the conventional algorithm based on weighted frequency warping by 3.21% on average when an appropriate number of speech frames is chosen. The experimental results also illustrate that the performance of a voice conversion system can be improved by taking full advantage of inter-frame information, because that information helps the converted speech retain the more stable acoustic properties that are inherent across frames.
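The feature-compression step can be sketched as follows: several consecutive LSP frames are concatenated, the concatenated vector is approximately sparse in the DCT domain because of inter-frame redundancy, and a random measurement matrix reduces it to a short feature vector for training. This is a generic compressed-sensing illustration; the frame count, measurement count, and Gaussian sensing matrix are assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.fft import dct

def compress_lsp_blocks(lsp, n_concat=4, n_measure=24, seed=0):
    """Compressed-sensing style feature extraction from concatenated LSP frames.

    lsp : (n_frames, lsp_order) Line Spectrum Pair vectors.
    Returns a (n_blocks, n_measure) matrix of compressed feature vectors.
    """
    rng = np.random.default_rng(seed)
    order = lsp.shape[1]
    n_blocks = lsp.shape[0] // n_concat
    # Concatenate consecutive frames; inter-frame redundancy makes the long
    # vector approximately sparse after an orthonormal DCT.
    blocks = lsp[:n_blocks * n_concat].reshape(n_blocks, n_concat * order)
    sparse_domain = dct(blocks, norm="ortho", axis=1)
    # Random Gaussian measurement matrix (a common generic choice).
    phi = rng.standard_normal((n_measure, n_concat * order)) / np.sqrt(n_measure)
    return sparse_domain @ phi.T
```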
Keywords: LPCC, voice conversion, compressed sensing, GMM, LSP
6. IBM Voice Conversion Systems for 2007 TC-STAR Evaluation (cited 2 times)
Authors: 双志伟 (Shuang Zhiwei), Raimo Bakis, 秦勇 (Qin Yong). Tsinghua Science and Technology (SCIE, EI, CAS), 2008, Issue 4, pp. 510-514.
This paper proposes a novel voice conversion method based on frequency warping. The frequency warping function is generated by mapping the formants of the source speaker to those of the target speaker. In addition to frequency warping, fundamental frequency adjustment, spectral envelope equalization, breathiness addition, and duration modification are used to improve the similarity to the target speaker. The proposed voice conversion method needs only a very small amount of training data for generating the warping function, thereby greatly facilitating its application. Systems based on the proposed method were used for the 2007 TC-STAR intra-lingual voice conversion evaluation for English and Spanish and a cross-lingual voice conversion evaluation for Spanish. The evaluation results show that the proposed method achieves much better quality of converted speech than other methods, as well as a good balance between quality and similarity. The IBM system was ranked No. 1 in the English evaluation and No. 2 in the Spanish evaluation. The evaluation results also show that the proposed method is a convenient and competitive method for cross-lingual voice conversion tasks.
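The central operation, a frequency warping function anchored at matched source and target formants, can be sketched as a piecewise-linear mapping applied to a magnitude spectral envelope by interpolation. The formant values in the usage comment are illustrative placeholders, and the other modifications mentioned in the abstract (F0 adjustment, spectral envelope equalization, breathiness, duration) are not shown.

```python
import numpy as np

def warp_envelope(envelope, src_formants, tgt_formants, sr=16000):
    """Piecewise-linear frequency warping of a spectral envelope.

    envelope     : (n_bins,) magnitude envelope on a linear frequency grid [0, sr/2].
    src_formants : source-speaker formant frequencies in Hz (ascending).
    tgt_formants : matched target-speaker formant frequencies in Hz (ascending).
    """
    n_bins = len(envelope)
    freqs = np.linspace(0.0, sr / 2, n_bins)
    # Anchor points of the warping function: target frequency -> source frequency.
    tgt_anchors = np.concatenate(([0.0], tgt_formants, [sr / 2]))
    src_anchors = np.concatenate(([0.0], src_formants, [sr / 2]))
    # For each output (target-axis) bin, find where to read the source envelope.
    read_freqs = np.interp(freqs, tgt_anchors, src_anchors)
    return np.interp(read_freqs, freqs, envelope)

# Hypothetical usage with illustrative formant values (Hz):
# warped = warp_envelope(env, src_formants=[730, 1090, 2440], tgt_formants=[620, 950, 2250])
```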
Keywords: voice conversion, frequency warping, formant mapping
7. Voice conversion using structured Gaussian mixture model in cepstrum eigenspace (cited 2 times)
Authors: LI Yangchun, YU Yibiao. Chinese Journal of Acoustics (CSCD), 2015, Issue 3, pp. 325-336.
A new methodology for voice conversion in a cepstrum eigenspace, based on a structured Gaussian mixture model, is proposed for non-parallel corpora without joint training. For each speaker, cepstrum features of speech are extracted and mapped to the eigenspace formed by the eigenvectors of their scatter matrix, and a Structured Gaussian Mixture Model in the EigenSpace (SGMM-ES) is trained. The source and target speakers' SGMM-ES models are matched according to the Acoustic Universal Structure (AUS) principle to obtain the spectrum transformation function. Experimental results show that the speaker identification rate of the converted speech reaches 95.25% and the average cepstrum distortion is 1.25, which are 0.8% and 7.3% better than the SGMM method, respectively. ABX and MOS evaluations indicate that the conversion performance is quite close to that of the traditional method under parallel-corpus conditions. The results show that the eigenspace-based structured Gaussian mixture model is effective for voice conversion with non-parallel corpora.
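A minimal sketch of the per-speaker training step, projecting cepstral frames onto the leading eigenvectors of their scatter matrix and fitting a GMM in that eigenspace, is given below. The AUS-based matching of the source and target models is not shown, and the eigenspace dimension, mixture count, and covariance type are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_eigenspace_gmm(cepstra, n_eigen=12, n_mix=32):
    """Project cepstral frames into a scatter-matrix eigenspace and fit a GMM.

    cepstra : (n_frames, n_ceps) cepstrum features of one speaker.
    Returns the mean, the eigenbasis, and the GMM trained in the eigenspace.
    """
    mean = cepstra.mean(axis=0)
    centered = cepstra - mean
    scatter = centered.T @ centered                 # (n_ceps, n_ceps) scatter matrix
    eigval, eigvec = np.linalg.eigh(scatter)        # eigenvalues in ascending order
    basis = eigvec[:, ::-1][:, :n_eigen]            # keep the leading eigenvectors
    projected = centered @ basis
    gmm = GaussianMixture(n_components=n_mix, covariance_type="diag",
                          random_state=0).fit(projected)
    return mean, basis, gmm
```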
Keywords: LPCC, voice conversion, structured Gaussian mixture model, cepstrum eigenspace, GMM
8. An algorithm for voice conversion with limited corpus
Authors: GU Dong, JIAN Zhihua. Chinese Journal of Acoustics (CSCD), 2018, Issue 3, pp. 371-384.
Under the condition of a limited target-speaker corpus, this paper proposes a voice conversion algorithm that uses a unified tensor dictionary. First, parallel speech from N speakers is selected randomly from the speech corpus to build the basis of the tensor dictionary. Then, after multi-series dynamic time warping of the chosen speech, N two-dimensional basic dictionaries are generated, which together constitute the unified tensor dictionary. During the conversion stage, the dictionaries of the source and target speakers are established as linear combinations of the N basic dictionaries using the two speakers' speech. The experimental results show that when the number of basic speakers is 14, the algorithm achieves performance comparable to the traditional NMF-based method with only a small amount of target-speaker corpus, which greatly facilitates the application of voice conversion systems.
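The NMF baseline referred to above works by expressing each source frame as a non-negative combination of source exemplars and applying the same activations to the aligned target exemplars. The sketch below shows that generic backbone using per-frame non-negative least squares; the unified tensor dictionary itself, i.e., building the speaker dictionaries as linear combinations of N basic dictionaries, is not reproduced here.

```python
import numpy as np
from scipy.optimize import nnls

def exemplar_nmf_convert(src_frames, src_dict, tgt_dict):
    """Exemplar-based spectral conversion with shared non-negative activations.

    src_dict, tgt_dict : (n_dims, n_exemplars) time-aligned, non-negative
                         spectral exemplars of the source and target speaker.
    src_frames         : (n_frames, n_dims) non-negative frames to convert.
    """
    converted = np.zeros((src_frames.shape[0], tgt_dict.shape[0]))
    for t, frame in enumerate(src_frames):
        activations, _ = nnls(src_dict, frame)   # non-negative decomposition
        converted[t] = tgt_dict @ activations    # reuse activations on the target side
    return converted
```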
Keywords: DTW, voice conversion, limited corpus