Abstract
A topic-based adaptive language model can effectively address the problem of applying a statistical language model across topics. Such a model faces two main problems: how to cluster the corpus into topics, and how to combine the topic-specific language models. For the first, a new corpus clustering algorithm is adopted that overcomes some limitations of earlier classification methods. For the second, an improved method for combining the language models is proposed: probability + linear interpolation. The new method not only improves the performance of the language model but also speeds up the model.
Source
《计算机研究与发展》
EI
CSCD
PKU Core
2003, Issue 9, pp. 1368-1374 (7 pages)
Journal of Computer Research and Development
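Funding
National Natural Science Foundation of China (60203007)
National High Technology Research and Development Program of China ("863" Program), Major Project Fund (2001AA114040)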
Keywords
language model
adaptive
topic-based
clustering