Abstract
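To provide a practical technical route for applying speaker recognition systems, a DSP (Digital Signal Processor) system was developed around the TMS320VC5402 DSP from Texas Instruments (TI) as its CPU, and an open-set speaker recognition system based on speaker adaptation was implemented on it in real time. To improve processing speed and recognition accuracy, the system builds each speaker model from a small amount of speech data and, on top of an improved vector quantization method, applies a speaker-adaptive threshold processing algorithm, which effectively raises the recognition rate. Reducing the computational load of the algorithm and the amount of data to be stored was also studied in some depth. Taking response time, training time, and related factors together, this makes it feasible to implement a genuine speaker recognition system on a DSP chip. Experiments show that the system achieves good results under ordinary computer-room conditions, with a recognition time of less than 1 s, which fully meets the real-time requirement.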
To provide a practical route for applying speaker recognition systems, this paper presents a real-time open-set speaker recognition system based on a speaker-adaptive dynamic threshold, implemented on a TMS320VC5402 digital signal processor. To improve processing speed and recognition precision, the system builds each speaker's voice model from a small amount of speech data and, building on a revised vector quantization algorithm, introduces a dynamic threshold method that greatly improves recognition accuracy. Reducing the amount of computation and storage was also studied thoroughly. Taking into account factors such as the system's response time and training time, it becomes feasible to implement a practical speaker recognition system on a digital signal processor. Experimental results show that the recognition rate of the system is satisfactory and that its recognition time is less than 1 second, which meets the real-time requirement.
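The abstract does not spell out the improved vector quantization method or the threshold rule, so the following is only a minimal sketch of the general idea of open-set identification with a per-speaker threshold. Everything in it is an assumption made for illustration: plain k-means stands in for the paper's VQ variant, the threshold is taken as a fixed multiple (`margin`) of each speaker's own training distortion, the function names (`enroll`, `identify`, `synth`) are hypothetical, and synthetic 2-D points replace the Mel-frequency cepstral feature vectors a real system would use.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(sqdist(f, cw) for cw in codebook) for f in frames) / len(frames)

def train_codebook(frames, size, iters=10, seed=0):
    """Plain k-means codebook (assumed stand-in for the paper's improved VQ)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for f in frames:
            nearest = min(range(size), key=lambda i: sqdist(f, codebook[i]))
            cells[nearest].append(f)
        for i, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def enroll(frames, size=4, margin=3.0):
    """Speaker model = codebook + speaker-specific threshold.
    The rule threshold = margin * training distortion is hypothetical."""
    codebook = train_codebook(frames, size)
    return {"codebook": codebook,
            "threshold": margin * avg_distortion(frames, codebook)}

def identify(frames, models):
    """Return the best-matching enrolled speaker, or None for an unknown speaker."""
    scores = {name: avg_distortion(frames, m["codebook"]) for name, m in models.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= models[best]["threshold"] else None

def synth(center, n, rng, spread=0.3):
    """Synthetic 'feature frames' scattered around a centre point."""
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(n)]

rng = random.Random(1)
models = {
    "alice": enroll(synth([0.0, 0.0], 200, rng)),
    "bob": enroll(synth([5.0, 5.0], 200, rng)),
}
print(identify(synth([0.0, 0.0], 50, rng), models))     # enrolled speaker
print(identify(synth([20.0, -20.0], 50, rng), models))  # impostor far from every model
```

Because each threshold scales with that speaker's own training distortion, a speaker whose features are naturally more scattered gets a looser acceptance bound, which is the intuition behind a speaker-adaptive threshold as opposed to one global cutoff.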
Source
《吉林大学学报(信息科学版)》
CAS
2006, Issue 3, pp. 252-258 (7 pages)
Journal of Jilin University (Information Science Edition)
Funding
Supported by the Changchun Science and Technology Plan Fund (Project No. 05GG18)
Keywords
speaker recognition system
open-set
speaker-adaptive dynamic threshold
Mel-frequency cepstral coefficients
digital signal processor