Abstract
To give buyers with differing requirements well-founded recommendations that reflect both the quality of coal products and regional differences in where they are sold, this paper proposes a text CNN recommendation model (TCRM) that combines convolutional neural networks with natural language processing. The model generates embedding vectors from the word features of each field, applies convolution and max pooling with kernels of different sizes, and then merges all features through a fully connected layer to obtain separate user and coal-product feature matrices. The user features and coal-product features are then taken as input and combined through a fully connected computation to produce a predicted rating. With a feature-matrix dimension of 32, the model loss settles near 1.0 after 20 iterations and the accuracy holds at about 0.25; in both loss and robustness the model outperforms one that computes ratings from traditional numerical-ratio features. This approach mitigates the inaccurate nearest-neighbor similarity calculations caused by single-factor ratings and the poor recommendations caused by matrix sparsity. Experiments on a large volume of real data show that deep coal-product recommendation built on this algorithm performs well.
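The pipeline summarized in the abstract (embedding, convolution with several kernel sizes, max pooling, a fully connected merge, then a fully connected rating head) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class name `TextCNNEncoder`, the vocabulary size, kernel sizes, filter counts, token IDs, and the ReLU activations are all assumptions chosen for illustration, and the weights are random and untrained. Only the feature dimension of 32 comes from the abstract.

```python
import random


def make_matrix(rows, cols, rng):
    """Random small weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def relu(x):
    return max(0.0, x)


def dense(vec, weights, bias):
    """Fully connected layer: weights has shape (out_dim, in_dim)."""
    return [dot(row, vec) + b for row, b in zip(weights, bias)]


class TextCNNEncoder:
    """Hypothetical encoder: token IDs -> fixed-length feature vector."""

    def __init__(self, vocab_size, embed_dim, kernel_sizes,
                 filters_per_size, out_dim, rng):
        self.embedding = make_matrix(vocab_size, embed_dim, rng)
        self.kernel_sizes = kernel_sizes
        # Each filter spans k consecutive tokens, flattened to k * embed_dim.
        self.filters = {
            k: make_matrix(filters_per_size, k * embed_dim, rng)
            for k in kernel_sizes
        }
        pooled_dim = len(kernel_sizes) * filters_per_size
        self.fc_w = make_matrix(out_dim, pooled_dim, rng)
        self.fc_b = [0.0] * out_dim

    def encode(self, token_ids):
        # Pad with ID 0 so the widest kernel always has one full window.
        ids = list(token_ids) + [0] * (max(self.kernel_sizes) - len(token_ids))
        embedded = [self.embedding[t] for t in ids]
        pooled = []
        for k in self.kernel_sizes:
            # Flatten every window of k consecutive embedding vectors.
            windows = [
                sum(embedded[i:i + k], [])
                for i in range(len(embedded) - k + 1)
            ]
            for filt in self.filters[k]:
                # Convolution followed by max-over-time pooling.
                pooled.append(max(relu(dot(filt, w)) for w in windows))
        # Fully connected layer merges features from all kernel sizes.
        return [relu(v) for v in dense(pooled, self.fc_w, self.fc_b)]


def predict_score(user_vec, item_vec, weights, bias):
    """Concatenate both feature vectors and map them to one rating value."""
    return dense(user_vec + item_vec, weights, bias)[0]


rng = random.Random(0)
user_encoder = TextCNNEncoder(50, 8, (2, 3, 4), 4, 32, rng)
item_encoder = TextCNNEncoder(50, 8, (2, 3, 4), 4, 32, rng)
score_w = make_matrix(1, 64, rng)

user_vec = user_encoder.encode([3, 17, 8, 21, 5])      # made-up buyer text
item_vec = item_encoder.encode([11, 4, 9, 30, 2, 7])   # made-up product text
score = predict_score(user_vec, item_vec, score_w, [0.0])
print(len(user_vec), len(item_vec), score)
```

In a trained model, the embedding tables, filters, and dense weights would be learned by minimizing the error between `score` and observed ratings; the sketch only shows how the data flows through the stages.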
Authors
PAN Li-hu (潘理虎)
HAO Yan-jie (郝彦杰)
ZHOU Yao-hui (周耀辉)
GONG Da-li (龚大立)
School of Computer and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; Huajin Coking Coal Group of Shanxi Province, Lvliang 033300, China; Jingying Shuzhi Technology Co., Ltd., Taiyuan 030024, China
Source
Computer Technology and Development (《计算机技术与发展》)
2021, No. 4, pp. 198-203 (6 pages)
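Funding
Shanxi Province and Chinese Academy of Sciences Science and Technology Cooperation Project (20141101001)
Shanxi Province Science and Technology Key Research Project (20141039)
Shanxi Province Key Research Program Project (201603D121031)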
Keywords
natural language processing
word embedding
text convolution
feature matrix
score prediction