
A Continuous Speech Recognition Method Using Dependent Feature Transformation and Combination of Subspace Region (Cited by: 4)
Abstract: To improve speech recognition accuracy, a method based on subspace region-dependent feature transformation and combination (MFCC-BN-TC) is proposed. The method extracts a short-time spectral structure feature (BN) and an envelope feature (MFCC) to separately describe the structure and envelope information of the short-time speech spectrum, and applies region-dependent feature transformations to the BN and MFCC features, respectively. This transformation is then generalized to a subspace region-dependent feature transformation, so that different time granularities (frame and speech segment) yield multi-level discriminative feature representations. Finally, the discriminatively transformed features are jointly represented to train the acoustic model, and a general framework for discriminative feature transformation and combination is given. Experimental results show that the MFCC-BN-TC method improves recognition performance by 0.98% over the method using raw BN features and by 1.62% over the MFCC-feature baseline system; combining the features transformed by the MFCC-BN-TC method improves the recognition rate by 1.5% relative to combining the raw features.
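The core operation the abstract generalizes is a region-dependent feature transform (RDT): each frame is softly assigned to a set of acoustic "regions", and a posterior-weighted sum of per-region linear transforms produces a discriminative correction to the feature. The sketch below is a minimal illustration of that idea; the function names, Gaussian region model, and dimensions are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def region_posteriors(x, means, inv_vars):
    """Soft assignment of a frame x to R 'regions' modeled as diagonal Gaussians."""
    # Log-likelihood of x under each region (up to a shared constant).
    ll = -0.5 * np.sum(inv_vars * (x - means) ** 2, axis=1)
    ll -= ll.max()                      # numerical stability before exp
    p = np.exp(ll)
    return p / p.sum()

def rdt_transform(x, means, inv_vars, region_mats):
    """y = x + sum_r gamma_r(x) * (M_r @ x): posterior-weighted linear transforms."""
    gamma = region_posteriors(x, means, inv_vars)
    correction = np.einsum("r,rij,j->i", gamma, region_mats, x)
    return x + correction

rng = np.random.default_rng(0)
D, R = 39, 4                            # feature dim (e.g. MFCC + deltas), regions
means = rng.normal(size=(R, D))         # region centers (illustrative)
inv_vars = np.ones((R, D))              # inverse variances (illustrative)
region_mats = 0.01 * rng.normal(size=(R, D, D))  # per-region transform matrices

frame = rng.normal(size=D)
y = rdt_transform(frame, means, inv_vars, region_mats)
print(y.shape)                          # (39,)
```

In discriminative training (e.g. fMPE-style), the matrices `region_mats` would be optimized against a sequence-level objective; here they are random placeholders.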
Source: Journal of Xi'an Jiaotong University, 2016, No. 4, pp. 60-67 (8 pages); indexed in EI, CAS, CSCD, and the Peking University Core journal list.
Funding: National Natural Science Foundation of China (61175017, 61403415).
Keywords: speech recognition; discriminative training; deep neural network; subspace region-dependent feature transformation
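The final step described in the abstract, joint representation of the transformed MFCC and BN streams for acoustic-model training, can be sketched as frame-synchronous concatenation of the two streams. The stream dimensions below are assumptions for illustration only.

```python
import numpy as np

def combine_streams(mfcc_frames, bn_frames):
    """Frame-synchronous concatenation of two transformed feature streams."""
    assert mfcc_frames.shape[0] == bn_frames.shape[0], "streams must be time-aligned"
    return np.concatenate([mfcc_frames, bn_frames], axis=1)

T = 100                                                # number of frames
mfcc = np.random.default_rng(1).normal(size=(T, 39))   # e.g. MFCC + deltas
bn = np.random.default_rng(2).normal(size=(T, 40))     # e.g. bottleneck-layer width
joint = combine_streams(mfcc, bn)
print(joint.shape)                                     # (100, 79)
```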
