Abstract
Most CTR prediction algorithms initialize all feature embeddings with the same fixed dimension, ignoring the low popularity of long-tail item features. Giving these features embedding vectors of the same length as those of head items leads to unbalanced model training and degrades the final prediction results. To address this, this paper first adopts an end-to-end differentiable framework that automatically selects different embedding dimensions according to feature popularity. Second, it introduces a squeeze-and-excitation network mechanism and a multi-head self-attention mechanism with residual connections, which dynamically learn feature importance and identify important feature combinations, respectively, from different perspectives; a graph neural network then replaces the traditional inner product and Hadamard product to explicitly model second-order feature interactions. Finally, to further improve performance, a DNN component is combined with the shallow model to form a deep model, and a Bayesian optimization algorithm selects a set of hyperparameters for the deep model, avoiding a complex manual tuning process. Experiments on two benchmark datasets verify the effectiveness of the model.
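As a rough illustration of the squeeze-and-excitation idea mentioned in the abstract, the sketch below reweights per-field embeddings with learned importance scores. It is not the paper's implementation: the function name `se_reweight`, the mean-pooling squeeze, the ReLU/sigmoid activations, the reduction size, and the random weights are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_reweight(embeddings, w1, w2):
    """Hypothetical squeeze-and-excitation reweighting of field embeddings.

    embeddings: (num_fields, dim) array, one embedding vector per feature field
    w1:         (num_fields, reduced) weights of the first excitation layer
    w2:         (reduced, num_fields) weights of the second excitation layer
    """
    # Squeeze: summarize each field embedding as one scalar (mean pooling assumed).
    z = embeddings.mean(axis=1)                  # shape (num_fields,)
    # Excitation: a two-layer bottleneck producing one weight per field.
    hidden = np.maximum(z @ w1, 0.0)             # ReLU, shape (reduced,)
    weights = sigmoid(hidden @ w2)               # shape (num_fields,), values in (0, 1)
    # Reweight: scale every field embedding by its importance weight.
    return embeddings * weights[:, None], weights

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
num_fields, dim, reduced = 4, 8, 2
emb = rng.normal(size=(num_fields, dim))
w1 = rng.normal(size=(num_fields, reduced))
w2 = rng.normal(size=(reduced, num_fields))
reweighted, weights = se_reweight(emb, w1, w2)
print(weights)
```

In a trained model, `w1` and `w2` would be learned jointly with the embeddings, so that informative fields receive weights closer to 1 and uninformative ones are suppressed.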
Authors
XIA Yi-chun (夏义春)
LI Wang-gen (李汪根)
LI Dou-dou (李豆豆)
GE Ying-kui (葛英奎)
WANG Zhi-ge (王志格)
School of Computer and Information, Anhui Normal University, Wuhu 241002, China
Source
Computer and Modernization (《计算机与现代化》)
2023, No. 3, pp. 29-37 (9 pages)
Funding
University Leading Talent Introduction and Cultivation Program (051619).
Keywords
CTR prediction
automatic embedding search
squeeze-and-excitation network
multi-head self-attention mechanism
graph neural network
Bayesian optimization