Abstract
In natural language processing, the LDA (Latent Dirichlet Allocation) topic model is a probabilistic model for semantic text mining. It discovers the latent topics in documents and reduces a document represented in word (term) space to a low-dimensional representation in topic space, that is, a random mixture over latent topics, in support of tasks such as information retrieval and text classification. This paper presents the generative process for each document in a corpus and the graphical model representation of LDA. Building on these, it also discusses extended models based on LDA and future research trends.
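The generative process mentioned in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and rests on several assumptions that are not from the paper: the toy vocabulary `VOCAB`, the fixed topic-word table `TOPIC_WORD`, the prior `[0.5, 0.5]`, and all function names are hypothetical. The topic-word distributions are held fixed here, whereas LDA variants may also place a Dirichlet prior on them.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topic_word, alpha, length, rng):
    """Generate one document following the LDA generative story.

    topic_word: list of K word distributions, each over the vocabulary
    alpha:      Dirichlet prior parameters over the K topics
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    topics, words = [], []
    for _ in range(length):
        z = sample_categorical(theta, rng)          # topic for this position
        w = sample_categorical(topic_word[z], rng)  # word drawn from topic z
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical toy setup: 2 topics over a 4-word vocabulary.
VOCAB = ["gene", "cell", "stock", "market"]
TOPIC_WORD = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 only emits "gene" / "cell"
    [0.0, 0.0, 0.5, 0.5],  # topic 1 only emits "stock" / "market"
]

rng = random.Random(0)
theta, topics, words = generate_document(TOPIC_WORD, [0.5, 0.5], 10, rng)
print("topic mixture:", theta)
print("document:", [VOCAB[w] for w in words])
```

In this sketch the document lives in a 4-dimensional word space, while `theta` is its 2-dimensional representation in topic space. Inference in LDA runs in the opposite direction, estimating `theta` and the topic-word distributions from observed words.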
Source
《智能计算机与应用》
2014, Issue 5, pp. 105-106 (2 pages)
Intelligent Computer and Applications
Keywords
Natural Language Processing
Topic Model
LDA