Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode

Abstract: We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input to an end-to-end speech recognition model. The LRBN has a compact structure, and its parameters can be learned quickly. Compared with a convolutional neural network, it has a simpler, more easily understood structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition, and demonstrate that the LRBN helps to differentiate among multiple language speech sets.
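The abstract names Connectionist Temporal Classification (CTC) as the output stage but gives no implementation details. As a minimal sketch of the standard CTC collapsing rule only (merge consecutive repeated labels, then drop blanks), and not the authors' code, the following may help. The function name `ctc_collapse`, the blank index 0, and the example label path are all assumptions made for illustration.

```python
from itertools import groupby


def ctc_collapse(frame_labels, blank=0):
    """Collapse a per-frame label path into an output label sequence.

    Standard CTC rule: first merge runs of identical consecutive labels,
    then remove the blank symbol. `blank=0` is an assumed convention.
    """
    # groupby yields one key per run of identical consecutive labels
    return [label for label, _ in groupby(frame_labels) if label != blank]


# Hypothetical best-path output of a BiLSTM over 8 frames (label ids are made up)
path = [0, 3, 3, 0, 3, 5, 5, 0]
print(ctc_collapse(path))  # [3, 3, 5]
```

The blank between the two runs of label 3 is what lets the same label appear twice in a row in the output; without it the two runs would merge into one.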

Authors: Yue Zhao, Jianjian Yue, Wei Song, Xiaona Xu, Xiali Li, Licheng Wu, Qiang Ji

Affiliations: School of Information and Engineering; Rensselaer Polytechnic Institute

Source: Journal on Internet of Things, 2019, No. 1, pp. 17-23 (7 pages)

Keywords: multi-dialect speech recognition; Tibetan language; latent regression Bayesian network; end-to-end model

Classification code: TN9 [Electronics and Telecommunications: Information and Communication Engineering]
