Abstract
This paper studies acoustic modeling methods for Chinese-English bilingual speech recognition. To improve recognition performance and reduce the number of model parameters in the bilingual system, it proposes a Chinese-English phone model clustering algorithm that combines a model distance measure based on state-time alignment (STA) with acoustic knowledge, and compares it with other methods. Experiments show that, with the model parameter count held at the same level, the proposed algorithm significantly outperforms direct merging of the language-dependent phone inventories, and also improves, to varying degrees, on clustering based on the Bhattacharyya distance and on the likelihood distance.
Source
《电声技术》
2009, No. 11, pp. 68-71 (4 pages)
Audio Engineering
Funding
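Joint project funded by the National Natural Science Foundation of China and Microsoft Research Asia (60776800)
National High Technology Research and Development Program of China (863 Program) (2006AA010101, 2007AA04Z223, 2008AA02Z414)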