Abstract
With the rapid growth of Internet advertising, predicting the click-through rate (CTR) of target users on Internet advertisements has become a key technology for precisely targeted ad delivery, as well as an active research topic in computational advertising and a popular application of deep neural networks. To improve the accuracy of CTR prediction, this work proposes a prediction model based on deep belief nets (DBNs) and, through experiments on 10 million randomly selected samples from a dataset provided by the Kaggle data mining platform, studies how the number of hidden layers and the number of hidden units affect prediction results. To address the training efficiency of DBNs in large-scale industrial solutions, experiments show that in CTR prediction the DBN loss function has a large number of stationary (stagnation) points and that these points severely reduce training efficiency. To improve efficiency, and starting from the characteristics of the network loss function, the work further proposes a fusion algorithm that combines stochastic gradient descent with an improved particle swarm optimization algorithm to optimize network training. When the iteration step size falls below a threshold, the fusion algorithm can escape the plateau around a stationary point and resume normal iteration. Experimental results show that, compared with the traditional CTR prediction model based on gradient boosting decision trees and logistic regression and with the fuzzy deep neural network model, the DBN-based model predicts more accurately, improving mean squared error, area under the curve (AUC), and log loss by 2.39%, 9.70%, and 2.46% and by 1.24%, 7.61%, and 1.30%, respectively. Training the DBN with the fusion method improves training efficiency by 30% to 70%.
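The escape mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved particle swarm variant, so a textbook particle swarm is assumed here, and a one-dimensional toy loss stands in for the deep belief net loss. The names `loss`, `grad`, `pso_escape`, and `train`, and all hyperparameter values, are hypothetical.

```python
import random

def loss(x):
    # Toy loss (assumption): stationary point at x = 0, minima at x = +1 and x = -1.
    return (x * x - 1.0) ** 2

def grad(x):
    # Analytic derivative of the toy loss.
    return 4.0 * x * (x * x - 1.0)

def pso_escape(x0, rng, n_particles=10, n_iters=20, radius=0.5):
    """Run a small standard particle swarm around x0; return the best position found."""
    pos = [x0 + rng.uniform(-radius, radius) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = list(pos)                 # per-particle best positions
    g_best = min(pos + [x0], key=loss)   # swarm-wide best, never worse than x0
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (0.5 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_best - pos[i]))
            pos[i] += vel[i]
            if loss(pos[i]) < loss(best_pos[i]):
                best_pos[i] = pos[i]
            if loss(pos[i]) < loss(g_best):
                g_best = pos[i]
    return g_best

def train(x, lr=0.05, threshold=1e-6, max_steps=2000, use_pso=True, seed=0):
    """Gradient descent that hands over to the swarm when the step is below threshold."""
    rng = random.Random(seed)
    for _ in range(max_steps):
        step = lr * grad(x)
        if abs(step) < threshold:
            if not use_pso:
                break                    # plain gradient descent stalls here
            candidate = pso_escape(x, rng)
            if loss(x) - loss(candidate) < 1e-12:
                break                    # swarm found nothing better: treat as converged
            x = candidate                # jump off the stationary point
        else:
            x -= step
    return x

if __name__ == "__main__":
    print("plain descent loss:", loss(train(0.0, use_pso=False)))
    print("fused descent loss:", loss(train(0.0, use_pso=True)))
```

Started exactly at the stationary point x = 0, plain gradient descent never moves, while the fused loop uses the swarm to find a lower-loss position and then resumes gradient steps. The toy uses full gradients rather than minibatches; a stochastic variant would replace `grad` with a minibatch estimate.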
Authors
陈杰浩
张钦
王树良
史继筠
赵子芊
CHEN Jie-Hao; ZHANG Qin; WANG Shu-Liang; SHI Ji-Yun; ZHAO Zi-Qian (School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China)
Source
《软件学报》
EI
CSCD
PKU Core Journal
2019, Issue 12, pp. 3665-3682 (18 pages)
Journal of Software
Keywords
click-through rate prediction
deep belief net
stagnation point
particle swarm algorithm
fusion algorithm