Abstract
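[Objective] To address the limited effectiveness of existing paper recommendation methods in handling sparse paper-author mappings and in representing features, this study develops a new paper recommendation framework based on Factorization Machines and ensemble learning. [Methods] Convolutional neural networks, network embedding, and related methods are used to process the data and obtain feature representations. The feature matrix is fed into Factorization Machines, the random subspace method is introduced to train the models as an ensemble, and a voting mechanism combines the learners to output the recommendations. [Results] Experiments on the CiteULike dataset show that the proposed method achieves a precision, accuracy, and F-measure of 72.6%, 69.7%, and 76.2%, improving on the baseline algorithms by more than 20, 15, and 9 percentage points, respectively. [Limitations] The negative sampling process does not consider the semantic similarity between positive and negative samples, and the model's input construction and feature processing modes need further exploration. [Conclusions] The ensemble Factorization Machine represents and exploits features effectively under sparse data, thereby improving recommendation performance.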
[Objective] This study proposes an improved paper recommendation framework based on Ensemble Learning and Factorization Machine. It addresses the shortcomings of existing methods in processing sparse data and representing features. [Methods] First, we used Convolutional Neural Network, Network Embedding, and other algorithms to obtain feature representations, which were then fed into Factorization Machine learners. We then trained homogeneous weak Factorization Machine learners based on Ensemble Learning and integrated them into a stronger learner through a voting mechanism to generate the final recommendations. [Results] We examined the new model on the CiteULike dataset. Precision, Accuracy, and F-Measure reached 72.6%, 69.7%, and 76.2%, respectively, which are more than 20, 15, and 9 percentage points higher than those of the benchmark algorithms. [Limitations] The input construction, sampling strategy, and feature processing mode need to be further explored. [Conclusions] The proposed Ensemble Factorization Machine enables effective representation and utilization of features under data sparsity, enhancing recommendation performance.
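The pipeline summarized in the abstract (Factorization Machine learners trained on random feature subspaces and combined by voting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature matrix is assumed to be already built (the CNN and network-embedding steps are omitted), and the logistic loss, the SGD update, the synthetic data, and all names and hyperparameters (`k`, `lr`, `reg`, `epochs`, `n_learners`, `subspace_ratio`) are assumptions chosen for the example.

```python
import numpy as np


def fm_score(X, w0, w, V):
    """Second-order FM score for each row of X (shape n x d).

    Uses the identity sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ].
    """
    XV = X @ V
    pairwise = 0.5 * np.sum(XV ** 2 - (X ** 2) @ (V ** 2), axis=1)
    return w0 + X @ w + pairwise


def train_fm(X, y, k=4, lr=0.05, reg=0.01, epochs=30, rng=None):
    """Fit one FM with per-sample SGD on logistic loss (assumed); y is 0/1."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, d = X.shape
    w0, w = 0.0, np.zeros(d)
    V = rng.normal(scale=0.01, size=(d, k))
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = X[i]
            xv = x @ V  # per-factor sums, shape (k,)
            score = w0 + x @ w + 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
            # Gradient of the logistic loss with respect to the score.
            g = 1.0 / (1.0 + np.exp(-np.clip(score, -30, 30))) - y[i]
            # d(score)/dV[i, f] = x_i * xv_f - x_i^2 * V[i, f]
            grad_V = np.outer(x, xv) - (x ** 2)[:, None] * V
            w0 -= lr * g
            w -= lr * (g * x + reg * w)
            V -= lr * (g * grad_V + reg * V)
    return w0, w, V


def train_ensemble(X, y, n_learners=5, subspace_ratio=0.6, seed=0, **fm_kwargs):
    """Random subspace method: each learner sees a random subset of columns."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    m = max(1, int(subspace_ratio * d))
    ensemble = []
    for _ in range(n_learners):
        cols = rng.choice(d, size=m, replace=False)
        params = train_fm(X[:, cols], y, rng=rng, **fm_kwargs)
        ensemble.append((cols, params))
    return ensemble


def predict_vote(ensemble, X):
    """Majority vote over the learners' 0/1 predictions (ties resolve to 0)."""
    votes = np.stack(
        [(fm_score(X[:, cols], *params) > 0).astype(int) for cols, params in ensemble]
    )
    return (2 * votes.sum(axis=0) > len(ensemble)).astype(int)


# Synthetic stand-in for the feature matrix; the label depends on a
# feature interaction, which is the kind of signal an FM can model.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(120, 8))
y_demo = (X_demo[:, 0] * X_demo[:, 1] + X_demo[:, 2] > 0).astype(int)

ensemble = train_ensemble(X_demo, y_demo, n_learners=5, epochs=20)
accuracy = (predict_vote(ensemble, X_demo) == y_demo).mean()
print(f"training accuracy of the voted ensemble: {accuracy:.3f}")
```

Sampling feature columns rather than rows is what distinguishes the random subspace method from bagging; an odd number of learners avoids tied votes.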
Authors
杨辰
郑若桢
王楚涵
耿爽
王楠
Yang Chen; Zheng Ruozhen; Wang Chuhan; Geng Shuang; Wang Nan (College of Management, Shenzhen University, Shenzhen 518060, China)
Source
《数据分析与知识发现》
CSCD
Peking University Core Journal
2023, Issue 8, pp. 128-137 (10 pages)
Data Analysis and Knowledge Discovery
Funding
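This work is one of the research outcomes of projects supported by the National Natural Science Foundation of China (Grant Nos. 71701134, 71901150) and the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2019A1515011392).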
Keywords
Research Paper Recommendation (论文推荐)
Factorization Machine (因子分解机)
Ensemble Learning (集成学习)