Funding: The National Natural Science Foundation of China (No. 60975017); the Natural Science Foundation of Guangdong Province (No. 10252800001000001); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 10KJB510005).
Abstract: To improve the performance of voice conversion, fundamental frequency (F0) transformation methods are investigated and an efficient F0 transformation algorithm is proposed. First, unlike the traditional linear transformation methods, the relationships between F0s and spectral parameters are explored. In each component of the Gaussian mixture model (GMM), the F0s are predicted from the converted spectral parameters using support vector regression (SVR). Then, to reduce the over-smoothing caused by the statistical averaging of the GMM, a mixed transformation method combining SVR with the traditional mean-variance linear (MVL) conversion is presented. In addition, the adaptive median filter, widely used in image processing, is adopted to solve the discontinuity problem caused by frame-wise transformation. Objective and subjective experiments are carried out to evaluate the performance of the proposed method. The results demonstrate that the proposed method outperforms the traditional F0 transformation methods in terms of both similarity and quality.
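
The traditional MVL conversion and the median smoothing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common log-F0 form of MVL, in which voiced source frames are shifted and scaled to match the target speaker's log-F0 mean and standard deviation, and it assumes unvoiced frames are marked with F0 = 0. A plain fixed-window median filter stands in for the adaptive median filter, whose details the abstract does not give. The SVR prediction and the mixing rule are not shown, since the abstract does not specify them. The function names `log_f0_stats`, `mvl_convert`, and `median_smooth`, and all numeric values, are hypothetical.

```python
import math
import statistics


def log_f0_stats(f0_track):
    """Mean and standard deviation of log-F0 over voiced frames (f0 > 0)."""
    log_f0 = [math.log(f) for f in f0_track if f > 0]
    return statistics.fmean(log_f0), statistics.pstdev(log_f0)


def mvl_convert(f0_track, src_stats, tgt_stats):
    """Mean-variance linear conversion in the log-F0 domain (assumed form).

    Unvoiced frames (f0 == 0) are passed through as 0.0.
    """
    mu_x, sigma_x = src_stats
    mu_y, sigma_y = tgt_stats
    converted = []
    for f in f0_track:
        if f > 0:
            log_f = mu_y + (sigma_y / sigma_x) * (math.log(f) - mu_x)
            converted.append(math.exp(log_f))
        else:
            converted.append(0.0)
    return converted


def median_smooth(f0_track, window=3):
    """Fixed-window median filter over voiced neighbours.

    This is a simplification: the window size does not adapt, unlike the
    adaptive median filter named in the abstract.
    """
    half = window // 2
    smoothed = []
    for i, f in enumerate(f0_track):
        if f <= 0:
            smoothed.append(0.0)  # keep unvoiced frames unvoiced
            continue
        lo, hi = max(0, i - half), min(len(f0_track), i + half + 1)
        neighbours = [v for v in f0_track[lo:hi] if v > 0]
        smoothed.append(statistics.median(neighbours))
    return smoothed


if __name__ == "__main__":
    # Made-up F0 tracks in Hz; 0.0 marks an unvoiced frame.
    source_f0 = [0.0, 110.0, 115.0, 180.0, 120.0, 125.0, 0.0]
    target_f0 = [210.0, 220.0, 0.0, 235.0, 240.0, 225.0]
    converted = mvl_convert(
        source_f0, log_f0_stats(source_f0), log_f0_stats(target_f0)
    )
    print(median_smooth(converted))
```

Working in the log domain keeps the converted F0 positive and matches the roughly logarithmic perception of pitch. Smoothing only over voiced neighbours avoids pulling voiced frames toward the zero values that mark unvoiced frames.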