Abstract
The contributions of static and dynamic articulatory information to speech recognition were evaluated, and approaches for combining articulatory information with acoustic features were discussed. Articulatory movements during read speech were recorded with the Electromagnetic Articulographic system, and the speech signals were recorded simultaneously. First, we conducted speech recognition experiments using articulatory features alone, drawn from a number of specific articulatory channels, to evaluate the contribution of each observation point on the articulators. Then, the displacement information of the articulatory data was combined directly with the acoustic features and used for speech recognition. The results show that articulatory information provides additional information for speech recognition that is not encoded in the acoustic features. Furthermore, the contribution of the dynamic information in the articulatory data was evaluated by adding it to the recognition features; the second derivative of the articulatory information contributed considerably more to speech recognition than the second derivative of the acoustic information. Finally, methods for combining articulatory and acoustic features were investigated. The basic approach is to attach a Bayesian Network (BN) to each state of the HMM: the articulatory information is represented by the BN as a factor of the observed signals during model training, and is marginalized out as a hidden variable at the recognition stage. Results based on this HMM/BN framework show better performance than the traditional method.
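The key operation in the HMM/BN framework is marginalizing the articulatory variable at recognition time, since it is observed only during training. A minimal numeric sketch of that marginalization follows; the discrete articulatory configurations, their probabilities, and the per-configuration Gaussian acoustic models are hypothetical illustrations, not the paper's actual model:

```python
import math

# In an HMM/BN state q, the articulatory variable A is observed in training
# but hidden in recognition, so the acoustic likelihood marginalizes over A:
#   p(x | q) = sum_a p(a | q) * p(x | a, q)

def gaussian_pdf(x, mean, var):
    """1-D Gaussian density, standing in for the acoustic model p(x | a, q)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def state_likelihood(x, components):
    """Marginalize the hidden articulatory variable A for one HMM state.

    components: list of (p_a_given_q, mean, var) tuples, one per discrete
    articulatory configuration a (hypothetical values for illustration).
    """
    return sum(p_a * gaussian_pdf(x, m, v) for p_a, m, v in components)

# Toy state with two articulatory configurations (made-up parameters).
components = [(0.6, 0.0, 1.0), (0.4, 2.0, 1.5)]
lik = state_likelihood(1.0, components)
```

In training, where the articulatory value `a` is observed, only the matching component's parameters would be updated; in recognition the weighted sum above replaces the ordinary state output density inside the Viterbi search.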