Abstract
An improved speech segmentation algorithm is proposed to meet the high segmentation accuracy required by text-prompted speaker recognition. Building on Viterbi-based segmentation, the method refines each boundary with a backward smoothed search for the multi-frame energy minimum. First, a model is trained for each digit from 0 to 9; next, the Viterbi algorithm segments a random digit string to obtain initial segmentation points; finally, these points are updated by searching for the multi-frame energy minimum. Experiments show that, within an error tolerance of 20 ms, the improved algorithm raises segmentation accuracy from 82.1% to 88% compared with the traditional segmentation algorithm.
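The boundary-update step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the window size, the smoothing width, or the exact meaning of "backward", so the sketch assumes that the search covers the frames at and before the initial Viterbi boundary and that smoothing is a simple moving average. The function names `frame_energies`, `smooth`, and `refine_boundary` and the parameters `search_back` and `width` are hypothetical.

```python
def frame_energies(samples, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    starts = range(0, len(samples) - frame_len + 1, hop)
    return [sum(x * x for x in samples[s:s + frame_len]) for s in starts]


def smooth(values, width=3):
    """Moving average over `width` frames; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def refine_boundary(energies, init_frame, search_back=5, width=3):
    """Move an initial boundary to the smoothed-energy minimum.

    Assumption: the search window spans the `search_back` frames before
    `init_frame` plus `init_frame` itself. `init_frame` would come from
    a Viterbi alignment, which is not shown here.
    """
    smoothed = smooth(energies, width)
    lo = max(0, init_frame - search_back)
    # On ties, min() keeps the earliest candidate frame.
    return min(range(lo, init_frame + 1), key=lambda i: smoothed[i])


# Toy example: an energy dip around frame 4, initial boundary at frame 7.
energies = [5, 5, 5, 1, 0.2, 1, 5, 5, 5, 5]
refined = refine_boundary(energies, init_frame=7)
```

In this toy input the initial boundary sits in a high-energy region and the sketch moves it back to the low-energy gap between the two segments. A real system would convert the refined frame index to a time using the frame hop.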
Source
Communications Technology (《通信技术》)
2015, No. 9, pp. 1027-1031 (5 pages)
Funding
ZTE Industry-Academia-Research Cooperation Research Project (No. CON1307160001)