Study on performance and improvement of speaker recognition of phoneme feature super vector

Abstract: High-level information such as phoneme-level features is hardly affected by channel conditions, and is therefore regarded as a useful complement to speaker recognition systems built on low-level acoustic parameters. High-level information, however, suffers from data sparsity. A recognition method based on phoneme feature super vectors is established, and its performance is analyzed using the BUT phoneme recognizer. Data pruning and Kernel Principal Component Analysis (KPCA) mapping are then tried as ways to improve the method. The results show that data pruning does not effectively improve recognition performance, whereas the recognition algorithm that incorporates KPCA mapping improves significantly. When the method is further fused with a mainstream GMM-UBM (Gaussian Mixture Model-Universal Background Model) system, the EER (Equal Error Rate) drops from 8.4% for the GMM-UBM system alone to 6.7%.
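To make the EER figures in the abstract concrete, the following minimal Python sketch shows one common way to approximate an equal error rate from verification scores, and how the reported drop from 8.4% to 6.7% translates into a relative reduction. The function names and the toy scores are illustrative assumptions and are not taken from the paper; the paper's own scoring and fusion procedure is not described in this record.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration: a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the operating point where FAR and FRR are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative improvement of `improved` over `baseline` (same units)."""
    return (baseline - improved) / baseline


# Toy scores (assumed, not from the paper).
toy_eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])

# EER values reported in the abstract: 8.4% (GMM-UBM) and 6.7% (fused).
gain = relative_reduction(8.4, 6.7)
print(f"toy EER = {toy_eer:.2f}, relative EER reduction = {gain:.1%}")
```

Under these assumptions, the reported change corresponds to a relative EER reduction of roughly 20%.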

Authors: Yao Hong (姚红), Tan Min (谭敏), Guo Wu (郭武)

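Affiliations: Department of Electronic Information and Electrical Engineering, Hefei University; Department of Electronic Engineering and Information Science, University of Science and Technology of China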

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 26, pp. 140-142 (3 pages)

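Funding: National Natural Science Foundation of China (No. 60970161); Fundamental Research Funds for the Central Universities; Anhui Provincial Outstanding Young Talents Fund for Universities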

Keywords: phoneme feature; speaker recognition; Kernel Principal Component Analysis (KPCA); data pruning

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
