Application of variable frame length and variable frame rate in speaker verification

Abstract: A new method for extracting Mel-frequency cepstral coefficients (MFCC) is proposed from the perspective of variable frame length and variable frame rate. The method first restricts both the frame length and the frame shift to integer multiples of the pitch period (a pitch-synchronous algorithm). It then follows the principle of variable frame rate analysis and removes some frames where the acoustic features change slowly, which lowers the frame rate. Speaker verification experiments on the NIST 99 speaker recognition evaluation show that the method both improves system performance and reduces the frame rate, which saves storage space for feature files.
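The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `len_mult` and `shift_mult` multipliers, the Euclidean distance rule, and the threshold are all assumptions chosen for illustration, and the pitch tracker and the MFCC computation are left out entirely.

```python
import math


def pitch_synchronous_bounds(pitch_period_at, n_samples, len_mult=2, shift_mult=1):
    """Return (start, end) sample ranges for pitch-synchronous frames.

    pitch_period_at is a hypothetical callable that maps a sample index to
    the local pitch period in samples; a real system would obtain it from
    a pitch tracker. Frame length and frame shift are both integer
    multiples of that period (len_mult and shift_mult are assumed values).
    """
    bounds = []
    start = 0
    while True:
        period = pitch_period_at(start)
        if period <= 0:
            raise ValueError("pitch period must be a positive sample count")
        end = start + len_mult * period
        if end > n_samples:
            break
        bounds.append((start, end))
        start += shift_mult * period
    return bounds


def variable_frame_rate_select(features, threshold):
    """Return indices of frames kept by a simple variable-frame-rate rule.

    Assumed rule: keep a frame only when its Euclidean distance to the
    last kept frame exceeds the threshold, so slowly changing regions
    contribute fewer frames.
    """
    if not features:
        return []
    kept = [0]
    for i in range(1, len(features)):
        if math.dist(features[i], features[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Constant pitch period of 80 samples, frame length 2 periods, shift 1 period.
bounds = pitch_synchronous_bounds(lambda i: 80, n_samples=400)

# Toy 2-D "feature vectors" standing in for per-frame MFCCs.
toy_features = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [3.0, 0.0]]
kept = variable_frame_rate_select(toy_features, threshold=0.5)
```

In this toy run the second and fourth vectors lie close to their predecessors, so only three of the five frames are kept. Published variable frame rate methods use more elaborate distance measures; the fixed-threshold rule here only shows the idea of discarding frames in slowly varying regions.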

Authors: Wang Ming, Xiao Xi

Affiliation: Department of Electronic Engineering, Tsinghua University

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 8, pp. 2051-2052, 2076 (3 pages)

Keywords: speaker verification; pitch synchronization; variable frame rate algorithm

Classification: TP873 [Automation and Computer Technology: Detection Technology and Automation Devices]
