We proposed a method using latent regression Bayesian network (LRBN) toextract the shared speech feature for the input of end-to-end speech recognition model.The structure of LRBN is compact and its parameter learning...We proposed a method using latent regression Bayesian network (LRBN) toextract the shared speech feature for the input of end-to-end speech recognition model.The structure of LRBN is compact and its parameter learning is fast. Compared withConvolutional Neural Network, it has a simpler and understood structure and lessparameters to learn. Experimental results show that the advantage of hybridLRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classificationarchitecture for Tibetan multi-dialect speech recognition, and demonstrate the LRBN ishelpful to differentiate among multiple language speech sets.展开更多
文摘We proposed a method using latent regression Bayesian network (LRBN) toextract the shared speech feature for the input of end-to-end speech recognition model.The structure of LRBN is compact and its parameter learning is fast. Compared withConvolutional Neural Network, it has a simpler and understood structure and lessparameters to learn. Experimental results show that the advantage of hybridLRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classificationarchitecture for Tibetan multi-dialect speech recognition, and demonstrate the LRBN ishelpful to differentiate among multiple language speech sets.