
Research on a click-through rate prediction model based on ensemble learning methods (cited by: 3)

An advertisement click-through rate prediction model based on ensemble learning
Abstract: Advertisement logs accumulated on the Internet are sparse, high-dimensional, and extremely imbalanced between positive and negative samples. As a result, manual feature extraction is time-consuming and laborious, and a single prediction model struggles to achieve good predictive performance. To address these problems, this paper proposes GBDT-Stacking, a click-through rate prediction model that combines gradient boosted decision trees (GBDT) with Stacking. The model uses GBDT to extract and construct features automatically, then uses a Stacking ensemble to predict the click-through rate of online advertisements, which improves on the performance of a single prediction model. Experiments on a real advertising data set show that the GBDT-Stacking ensemble model raises AUC by at least 4% over the comparison models.
Authors: HE Xiao-juan (贺小娟); PAN Wen-jie (潘文捷); CHENG Hong (程宏) (School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai 201620; School of Statistics and Mathematics, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China)
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2019, No. 12, pp. 2278-2284 (7 pages)
Funding: 2016 Shanghai Sailing Program for Young Science and Technology Talents (16YF1415900); First-Level Discipline Construction Project in Statistics, Shanghai Lixin University of Accounting and Finance
Keywords: gradient boosted decision tree (GBDT); Stacking ensemble learning; SMOTE (synthetic minority oversampling technique); advertisement click-through rate
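
The pipeline described in the abstract (GBDT for automatic feature construction, followed by a Stacking ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation. The synthetic data, the choice of base learners (logistic regression and a random forest), the meta-learner, and all hyperparameters are assumptions made for illustration, since the abstract does not specify them. The sketch uses scikit-learn and encodes each sample by the leaf it reaches in every GBDT tree, a common way to turn a GBDT into a feature extractor.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic, imbalanced stand-in for an ad log (assumption: about 10% clicks).
X, y = make_classification(
    n_samples=3000, n_features=20, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)

# Fit the GBDT on a separate slice so the leaf features used downstream
# are not built from the same rows the stacked model is trained on.
X_gbdt, X_rest, y_gbdt, y_rest = train_test_split(
    X, y, test_size=0.7, stratify=y, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, y_rest, test_size=0.3, stratify=y_rest, random_state=0
)

gbdt = GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0)
gbdt.fit(X_gbdt, y_gbdt)


def leaf_indices(model, data):
    """Return, for each sample, the index of the leaf reached in every tree."""
    # apply() has shape (n_samples, n_estimators, 1) for binary problems.
    return model.apply(data)[:, :, 0]


# One-hot encode leaf indices; leaves unseen during fitting become all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")
Z_train = encoder.fit_transform(leaf_indices(gbdt, X_train)).toarray()
Z_test = encoder.transform(leaf_indices(gbdt, X_test)).toarray()

# Stacking: base learners produce out-of-fold predictions, and a
# logistic-regression meta-learner combines them (hypothetical choices).
stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=3,
)
stack.fit(Z_train, y_train)

auc = roc_auc_score(y_test, stack.predict_proba(Z_test)[:, 1])
print(f"AUC on held-out synthetic data: {auc:.3f}")
```

The AUC printed here describes only the synthetic data and says nothing about the results reported in the paper. The keywords also list SMOTE, which the sketch omits. If oversampling were added, it would normally be applied to the training split only, so that the held-out AUC is measured on the original class distribution.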
