Abstract
Through a careful study of the Latent Dirichlet Allocation (LDA) algorithm with Gibbs sampling and of the MapReduce computing framework, a distributed parallel implementation of LDA in Mahout was achieved. The computational performance of this distributed parallel program was examined in detail: the experiments showed high performance, and several key issues affecting performance were discussed in depth.
Source
《华东师范大学学报(自然科学版)》
CAS
CSCD
PKU Core (北大核心)
2013, No. 3, pp. 118-130 (13 pages)
Journal of East China Normal University(Natural Science)