Abstract
Boosting is a highly effective family of sequential ensemble algorithms and is widely used in practice. This paper focuses on the XGBoost algorithm, which is optimized for fast parallel tree construction and is fault-tolerant in distributed environments. These properties allow it to process datasets on the order of hundreds of millions of records quickly and accurately and to give reliable results. Through both simulation analysis and empirical analysis, the paper compares XGBoost with the GBDT and RF (random forest) algorithms and verifies its advantages.
Source
Computer Science and Application (《计算机科学与应用》)
2019, Issue 5, pp. 1029-1035 (7 pages)
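Funding
Basic Scientific Research Program of the Beijing Municipal Education Commission (Fundamental Statistical Theory and Analysis Methods for Big Data).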