Abstract
Because speech is easy to synthesize, synthesized spoofed speech poses a serious threat to the security of speaker verification systems. To further improve the ability of speaker verification systems to detect spoofed speech, a synthetic speech detection method is proposed that uses the frequency-domain information of the speech spectrogram, described by the local phase quantization (LPQ) algorithm. First, the spectrogram is divided into several sub-blocks, and LPQ is applied to each sub-block. After histogram statistical analysis, an LPQ feature vector is obtained and used as the input feature of a random forest classifier to perform synthetic speech detection. Experimental results show that the proposed method further reduces the tandem detection cost function (t-DCF) of the synthetic speech detection system and has stronger generalization ability.
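The feature extraction described in the abstract (split the spectrogram into sub-blocks, apply LPQ to each, then collect histograms) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the basic LPQ variant: a 3x3 window, four low-frequency points, sign quantization into an 8-bit code, and no decorrelation step. The function names, the 2x2 block grid, and the toy input are all hypothetical.

```python
import cmath
import math


def lpq_codes(block, win=3):
    """8-bit LPQ code for each pixel whose win x win neighbourhood fits in block."""
    r = win // 2
    a = 1.0 / win  # lowest non-zero frequency for this window size
    # Four frequency points (u, v) of basic LPQ (assumed, not from the paper).
    freqs = [(a, 0.0), (0.0, a), (a, a), (a, -a)]
    rows, cols = len(block), len(block[0])
    codes = []
    for i in range(r, rows - r):
        for j in range(r, cols - r):
            code = 0
            for k, (u, v) in enumerate(freqs):
                # Local Fourier coefficient over the window centred at (i, j).
                coeff = 0j
                for di in range(-r, r + 1):
                    for dj in range(-r, r + 1):
                        phase = -2j * math.pi * (u * dj + v * di)
                        coeff += block[i + di][j + dj] * cmath.exp(phase)
                # Quantize the signs of the real and imaginary parts to two bits.
                if coeff.real > 0:
                    code |= 1 << (2 * k)
                if coeff.imag > 0:
                    code |= 1 << (2 * k + 1)
            codes.append(code)
    return codes


def lpq_histogram(block, win=3):
    """Normalized 256-bin histogram of the LPQ codes in one sub-block."""
    codes = lpq_codes(block, win)
    hist = [0.0] * 256
    for c in codes:
        hist[c] += 1.0
    n = len(codes)
    return [h / n for h in hist] if n else hist


def lpq_feature(spec, n_row_blocks=2, n_col_blocks=2, win=3):
    """Concatenate the LPQ histograms of all sub-blocks of a spectrogram."""
    rows, cols = len(spec), len(spec[0])
    bh, bw = rows // n_row_blocks, cols // n_col_blocks
    feature = []
    for bi in range(n_row_blocks):
        for bj in range(n_col_blocks):
            block = [row[bj * bw:(bj + 1) * bw]
                     for row in spec[bi * bh:(bi + 1) * bh]]
            feature.extend(lpq_histogram(block, win))
    return feature


# Toy 12 x 16 "spectrogram" (synthetic values, not real speech data).
spec = [[math.sin(0.7 * i) * math.cos(0.3 * j) + 0.01 * i * j
         for j in range(16)] for i in range(12)]
feature = lpq_feature(spec)
print(len(feature))  # 4 sub-blocks x 256 bins
```

In a full pipeline, one such vector per utterance would be passed to a random forest classifier (for example, scikit-learn's `RandomForestClassifier`) trained on genuine and synthetic speech. That stage is omitted here to keep the sketch dependency-free.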
Authors
徐嘉
简志华
金宏辉
杨曼
XU Jia; JIAN Zhihua; JIN Honghui; YANG Man (School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Data Storage and Transmission Technology of Zhejiang Province, Hangzhou 310018, China)
Source
《电信科学》
Peking University Core Journal (北大核心)
2024, No. 2, pp. 63-71 (9 pages)
Telecommunications Science
Funding
Supported by the National Natural Science Foundation of China (No. 61201301, No. 61772166).