Efficient implementation for LDA in Mahout (article in English)

Cited by: 2
Abstract: Based on a careful study of Latent Dirichlet Allocation (LDA) with Gibbs sampling and of the MapReduce framework, a distributed parallel implementation of LDA in Mahout was achieved. The experiments examined the computational performance of this distributed parallel LDA program in detail and showed it to be high, and several key issues affecting performance were discussed.
Source: Journal of East China Normal University (Natural Science), 2013, No. 3, pp. 118-130 (13 pages). Indexed in: CAS, CSCD, Peking University Core Journals.
Keywords: Latent Dirichlet Allocation; Gibbs sampling; Mahout; distributed parallel computing; MapReduce framework

References (13)

  • 1. BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003(3): 993-1022.
  • 2. GRIFFITHS T L, STEYVERS M. Finding scientific topics[J]. Proceedings of the National Academy of Sciences, 2004(101): 5228-5235.
  • 3. VENNER J. Pro Hadoop[M]. New York: Apress, 2009.
  • 4. OWEN S, ANIL R, DUNNING T, FRIEDMAN E. Mahout in Action[M]. New York: Manning Publications, 2010.
  • 5. STEYVERS M, GRIFFITHS T. Probabilistic topic models[M]//LANDAUER T, MCNAMARA D, DENNIS S, et al. Latent Semantic Analysis: A Road to Meaning. [s.l.]: Routledge, 2007.
  • 6. HEINRICH G. Parameter estimation for text analysis[R]. Darmstadt: Fraunhofer IGD, 2004.
  • 7. NEWMAN D, ASUNCION A, SMYTH P, WELLING M. Distributed inference for latent Dirichlet allocation[J]. Proc Neural Information Processing Systems, 2007(20): 1081-1088.
  • 8. WANG Y, BAI H J, STANTON M, et al. PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications[M]. Lecture Notes in Computer Science 5564. Berlin: Springer, 2009: 301-314.
  • 9. GRIFFITHS T L, STEYVERS M. A probabilistic approach to semantic representation[C]//Proceedings of the Twenty-Fourth Annual Conference of Cognitive Science Society, 2002.
  • 10. LIU Z Y, ZHANG Y Z, CHANG E Y. PLDA+: parallel latent Dirichlet allocation with data placement and pipeline processing[J]. ACM Transactions on Intelligent Systems and Technology, 2011(2): 26.
