Abstract
An algorithm based on the Minimum Bayesian Information Criterion (MBIC) is proposed to optimally control the degree of node splitting in a decision tree. First, it is shown theoretically that MBIC handles the trade-off between the complexity of the model parameters and the size of the training set well. Then, a formula is given for the MBIC-based splitting stop criterion of the decision tree. Finally, experiments on Chinese continuous-speech all-syllable recognition show that, compared with the traditional Maximum Likelihood Criterion (MLC), MBIC adapts better to changes in the acoustic model parameters and in the training set.
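As a rough illustration of the idea summarized in the abstract, the sketch below shows a generic BIC-style stop test for one candidate split. It is not the paper's formula, which the abstract does not reproduce. It assumes each node is modeled by a single diagonal-covariance Gaussian, that the penalty is half the number of added parameters times the log of the total training frame count, and that `penalty_weight` is a hypothetical tuning factor. All function and parameter names are illustrative.

```python
import math


def gaussian_loglik(n_frames, variances):
    """Maximized log-likelihood of n_frames under one diagonal Gaussian.

    `variances` holds the maximum-likelihood variance of each dimension.
    """
    dim = len(variances)
    log_det = sum(math.log(v) for v in variances)
    return -0.5 * n_frames * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


def bic_allows_split(n_yes, var_yes, n_no, var_no, var_parent,
                     n_total, penalty_weight=1.0):
    """Return True if the likelihood gain of a split exceeds the BIC penalty.

    Assumption: a split adds one mean vector and one variance vector,
    i.e. 2 * dim free parameters; n_total is the full training frame count.
    """
    n_parent = n_yes + n_no
    gain = (gaussian_loglik(n_yes, var_yes)
            + gaussian_loglik(n_no, var_no)
            - gaussian_loglik(n_parent, var_parent))
    extra_params = 2 * len(var_parent)
    penalty = 0.5 * penalty_weight * extra_params * math.log(n_total)
    return gain > penalty
```

Under these assumptions the threshold grows with the log of the training set size and with the number of parameters a split adds, whereas a plain maximum-likelihood criterion would compare `gain` against a fixed, hand-tuned constant.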
Source
《计算机应用》
CSCD
PKU Core Journals (北大核心)
2005, No. 12, pp. 2792-2795 (4 pages)
Journal of Computer Applications
Keywords
Continuous Speech Recognition (CSR)
decision-tree-based clustering
Minimum Bayesian Information Criterion (MBIC)
splitting stop criterion