Factor Analysis for Language Identification Based on Phoneme Recognition

Cited by: 1

Abstract: In a language identification system based on phoneme recognition, a key issue is whether the tokens, or the token sequence, reflect language-related information. For a given utterance, however, the token sequence output by the phone recognizer is corrupted by nuisance factors such as speaker, channel, and background noise. To address this, each utterance is represented as a feature vector built from its phonotactic (n-gram) statistics. In this vector space, factor analysis is used to model the noise subspace, which is then removed during both language model training and recognition. Experiments on the NIST LRE 2007 task show that the proposed method outperforms the baseline phoneme-recognition-based language identification systems. In the 30 s test condition, the equal error rate (EER) falls by a relative 14.4% against the baseline phone recognition followed by language modeling (PRLM) system, and by a relative 12.9% against the baseline phone recognition followed by support vector machine (PRSVM) system.
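The idea in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual estimation procedure, so the code below is only an assumption-laden stand-in: it uses bigram relative frequencies as the n-gram vector, treats each utterance's deviation from its language mean as speaker/channel noise, and removes the dominant noise directions with an SVD-based projection rather than a full factor analysis model. All function names, the toy phone set, and the toy utterances are hypothetical.

```python
from collections import Counter
from itertools import product

import numpy as np


def bigram_vector(tokens, phone_set):
    """Map a phone token sequence to a vector of relative bigram frequencies."""
    counts = Counter(zip(tokens, tokens[1:]))
    total = max(sum(counts.values()), 1)
    # Fixed ordering over all phone pairs so every utterance shares one space.
    return np.array([counts[bg] / total for bg in product(phone_set, repeat=2)])


def nuisance_basis(vectors, labels, rank):
    """Estimate an orthonormal basis for within-language variability.

    Assumption: deviations of utterances from their language mean are
    dominated by speaker/channel effects. The top `rank` right singular
    vectors of those deviations span the estimated noise subspace.
    """
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    deviations = np.empty_like(vectors)
    for lang in np.unique(labels):
        mask = labels == lang
        deviations[mask] = vectors[mask] - vectors[mask].mean(axis=0)
    _, _, vt = np.linalg.svd(deviations, full_matrices=False)
    return vt[:rank].T  # shape: (dim, rank)


def remove_nuisance(vector, basis):
    """Subtract the component of `vector` that lies in the noise subspace."""
    return vector - basis @ (basis.T @ vector)


# Toy data (hypothetical): three phones, two languages, two utterances each.
phones = ["a", "b", "c"]
utterances = [
    (["a", "b", "a", "b", "c"], "lang1"),
    (["a", "b", "a", "b", "a"], "lang1"),
    (["c", "b", "c", "b", "a"], "lang2"),
    (["c", "b", "c", "c", "b"], "lang2"),
]
X = [bigram_vector(tokens, phones) for tokens, _ in utterances]
y = [lang for _, lang in utterances]
U = nuisance_basis(X, y, rank=1)
cleaned = [remove_nuisance(x, U) for x in X]
```

In a setup like this, the cleaned vectors would be the input to the language model or SVM back end, and the same projection would be applied to both training and test utterances so that the two stay in the same space.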

Authors: Zhong Haibing, Song Yan, Dai Lirong

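Affiliation: iFLYTEK Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China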

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2012, No. 1, pp. 105-110 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Keywords: automatic language identification; factor analysis; phone recognizer

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
