A Highly Efficient Voice Conversion Method Based on Codebook Mapping (一种基于码书映射的高效语音转换方法)

Cited by: 2
Abstract: To make human-robot voice interaction more natural, voice conversion can be used to change the speaker characteristics of a source (mechanical) voice so that it sounds like a natural target speaker. However, current mainstream voice conversion methods are computationally intensive and poorly suited to small-core embedded robots with strict real-time requirements. This paper introduces an efficient, improved segmental codebook conversion method. First, weighting coefficients are computed by matching the relative distances of Line Spectral Frequency (LSF) parameters, and the code word is predicted and reconstructed from them. Second, a bandwidth correction is applied to the predicted code word to overcome spectral shift. Experimental results show that, at comparable conversion performance, the method reduces running time by approximately 75% relative to the traditional Gaussian Mixture Model (GMM) method.
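The first step of the abstract (distance-based weighting followed by code word prediction) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the function names `codebook_weights` and `convert_frame`, the `gamma` parameter, the inverse-spacing emphasis on closely spaced LSFs, and the exponential mapping from distance to weight. The toy codebook values are made up. The bandwidth correction step is not shown because the abstract gives no detail about it.

```python
import numpy as np


def codebook_weights(lsf, src_codebook, gamma=1.0):
    """Soft weight of each source code word for one LSF frame (assumed form)."""
    lsf = np.asarray(lsf, dtype=float)
    src_codebook = np.asarray(src_codebook, dtype=float)
    # Spacing to the nearest neighbouring LSF; closely spaced LSFs get more emphasis.
    gaps = np.diff(lsf)
    nearest_gap = np.minimum(np.r_[gaps[0], gaps], np.r_[gaps, gaps[-1]])
    sensitivity = 1.0 / nearest_gap
    # Weighted absolute distance from the frame to every source code word.
    dist = np.abs(src_codebook - lsf) @ sensitivity
    # Map distances to weights that sum to one (shifted for numerical stability).
    w = np.exp(-gamma * (dist - dist.min()))
    return w / w.sum()


def convert_frame(lsf, src_codebook, tgt_codebook, gamma=1.0):
    """Predict the target LSF vector as a weighted sum of target code words."""
    w = codebook_weights(lsf, src_codebook, gamma)
    return w @ np.asarray(tgt_codebook, dtype=float)


# Toy 2-entry codebooks with 4 LSFs each (radians, strictly increasing).
src_cb = np.array([[0.3, 0.6, 1.0, 1.5], [0.4, 0.9, 1.4, 2.0]])
tgt_cb = np.array([[0.35, 0.7, 1.1, 1.6], [0.5, 1.0, 1.6, 2.2]])
frame = src_cb[0]  # a frame that coincides with the first source code word
weights = codebook_weights(frame, src_cb, gamma=10.0)
converted = convert_frame(frame, src_cb, tgt_cb, gamma=10.0)
```

In this toy case the input frame equals the first source code word, so nearly all of the weight falls on that entry and the converted frame is essentially the paired target code word. A smaller `gamma` would blend the target code words more evenly.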

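Authors: Wang Zhiwei (王志卫), Xu Ning (徐宁), Liu Xiaofeng (刘小峰)

Affiliations: College of Internet of Things Engineering, Hohai University; Hohai University-Aldebaran Robotics (France) Cognition and Robotics Laboratory; Changzhou Key Laboratory of Robotics and Intelligent Technology; Key Laboratory of Broadband Wireless Communication and Network Sensing Technology, Ministry of Education

Source: Microprocessors (《微处理机》), 2014, No. 1, pp. 65-69 (5 pages)

Funding: National Natural Science Foundation of China (60905060); Fundamental Research Funds for the Central Universities (2011B11114, 2012B07314, 2012B04014); Open Fund of the Key Laboratory of the Ministry of Education (NYKL201305)

Keywords: voice conversion; embedded systems; harmonic stochastic model; segmental codebook; man-machine interaction

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]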


References (13)

1. Wu C H, Hsia C C, Liu T H. Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis[J]. IEEE Transactions on Audio Speech and Language Processing, 2006, (04): 1109-1116.
2. Zuo G, Liu W. Genetic algorithm based RBF neural network for voice conversion[A]. IEEE, 2004. 4215-4218.
3. Desai S, Raghavendra E V, Yegnanarayana B. Voice conversion using artificial neural networks[A]. 2009. 3893-3896.
4. Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion[J]. IEEE Transactions on Speech and Audio Processing, 1998, (02): 131-142.
5. Kain A B. High resolution voice transformation[D]. Rockford College, 2001.
6. Stylianou Y, Cappé O. A system for voice conversion based on probabilistic classification and a harmonic plus noise model[A]. 1998. 281-284.
7. Arslan L M. Speaker transformation algorithm using segmental codebooks (STASC)[J]. Speech Communication, 1999, (03): 211-226.
8. Abe M, Nakamura S, Shikano K. Voice conversion through vector quantization[A]. 1988. 655-658.
9. Erro D, Moreno A, Bonafonte A. Flexible harmonic/stochastic speech synthesis[A]. 2007.
10. Zhi-Hua J, Zhen Y. Voice conversion using Viterbi algorithm based on Gaussian mixture model[A]. 2007. 32-35.

Co-cited literature (11)

1. Stylianou Y, Cappé O. A system for voice conversion based on probabilistic classification and a harmonic plus noise model[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, 1998: 281-284.
2. Wu C H, Hsia C C, Liu T H, et al. Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1109-1116.
3. Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion[J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
4. Kain A B. High resolution voice transformation[D]. Rockford College, 2001.
5. Zuo G, Liu W. Genetic algorithm based RBF neural network for voice conversion[C]. Intelligent Control and Automation, 2004. Fifth World Congress on. IEEE, 2004, 5: 4215-4218.
6. Desai S, Raghavendra E V, Yegnanarayana B, et al. Voice conversion using artificial neural networks[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, 2009: 3893-3896.
7. Erro D, Moreno A, Bonafonte A. Flexible harmonic/stochastic speech synthesis[C]. 6th ISCA Workshop on Speech Synthesis, 2007.
8. Toda T, Black A W, Tokuda K. Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter[C]. ICASSP 2005 (1): 9-12.
9. Ye H, Young S. High quality voice morphing[C]. Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP 04). IEEE International Conference on. IEEE, 2004(1): 1-9-12.
10. Turk O, Arslan L M. Robust processing techniques for voice conversion[J]. Computer Speech & Language, 2006, 20(4): 441-467.

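Citing literature (2)

1. 胡芳, 徐宁, 李海燕. Improvement of a voice conversion algorithm based on codebook mapping[J]. 微处理机, 2015, 36(2): 35-38. Cited by: 1
2. 曾歆, 张雄伟, 孙蒙, 苗晓孔, 姚琨. Research on vocal tract spectrum conversion based on the GMM model and joint LPC-MFCC features[J]. 声学技术, 2020, 39(4): 451-455. Cited by: 8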

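Second-level citing literature (9)

1. 潘梦鹞, 吕小勇, 陈少伟, 郇锐铁, 王锋. Innovation and practice of online teaching based on AI intelligent speech technology[J]. 创新创业理论研究与实践, 2022(24): 170-173. Cited by: 2
2. 陈华光. Animation production in AutoCAD[J]. 电脑编程技巧与维护, 2000(4): 86-89. Cited by: 1
3. 张雄伟, 苗晓孔, 曾歆, 孙蒙, 曹铁勇. Voice conversion technology: current research status and prospects[J]. 数据采集与处理, 2019, 34(5): 753-770. Cited by: 9
4. 屈晓宜. Construction and simulation of a safety early-warning model for rail transit stations based on video analysis technology[J]. 自动化与仪器仪表, 2021(3): 4-8. Cited by: 3
5. 罗春梅, 张风雷. Speaker recognition algorithm based on mean features and an improved deep neural network[J]. 声学技术, 2021, 40(4): 503-507. Cited by: 2
6. 陈晓红, 滕华. Research on English speech recognition based on deep machine learning[J]. 贵阳学院学报（自然科学版）, 2021, 16(3): 1-4. Cited by: 3
7. 王学光, 诸珺文, 张爱新. Voiceprint identity verification method based on ARIMA prediction of MFCC features[J]. 计算机科学, 2022, 49(5): 92-97. Cited by: 9
8. 李伟, 曾繁洋, 王博, 陈忠斌. Application of voiceprint recognition based on weighted dynamic MFCC feature combination to underground cable protection[J]. 电力信息与通信技术, 2022, 20(5): 16-22. Cited by: 2
9. 王兴林. Research on MFCC-based feature extraction of air traffic control voice commands[J]. 电声技术, 2023, 47(6): 68-72.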

