Abstract
Academic collaborator recommendation is an effective application of academic big data. However, existing methods ignore the contextual relationship between researchers and research topics, and therefore cannot recommend suitable collaborators. We propose BERT-based collaborator recommendation (BACR), which aims to recommend high-potential collaborators that meet researchers' requirements. To this end, we design a new recommendation framework with two basic components: the BERT (bidirectional encoder representations from transformers) pre-trained language model and a logistic regression (LR) model. BERT jointly represents the researcher and the research topic to obtain a sentence-level, context-aware feature vector. LR takes the feature vector output by BERT as input and computes the probability that the sample belongs to the positive class; the framework then outputs the K collaborators with the highest probabilities. Comparative experiments against the network-embedding-based SDNE and TSE algorithms show that the BERT model, which fully accounts for the contextual relationship between researchers and research topics, yields better feature vector representations and improves the accuracy of collaborator recommendation.
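The scoring and ranking stage described in the abstract (LR over a feature vector, then top-K selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors stand in for BERT sentence-level embeddings, and the candidate names, weights, and bias are hypothetical values chosen only for illustration. A real system would learn the LR parameters from labeled collaboration samples.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def positive_probability(features: Sequence[float],
                         weights: Sequence[float],
                         bias: float) -> float:
    """LR step: probability that a sample belongs to the positive class."""
    if len(features) != len(weights):
        raise ValueError("feature and weight dimensions differ")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def top_k_collaborators(candidates: Dict[str, Sequence[float]],
                        weights: Sequence[float],
                        bias: float,
                        k: int) -> List[Tuple[str, float]]:
    """Score every candidate and return the k with the highest probability."""
    scored = [(name, positive_probability(vec, weights, bias))
              for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: short vectors standing in for BERT embeddings
# of a (researcher, research topic, candidate) input.
candidates = {
    "candidate_a": [0.9, 0.1, 0.3],
    "candidate_b": [0.2, 0.8, 0.5],
    "candidate_c": [0.4, 0.4, 0.9],
}
weights = [1.5, -1.0, 0.5]  # assumed, not learned
bias = -0.2                 # assumed, not learned

ranking = top_k_collaborators(candidates, weights, bias, k=2)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

Sorting all candidates is adequate for small candidate sets; for large ones, a partial selection such as `heapq.nlargest` would avoid the full sort.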
Authors
ZHOU Yi-min (周亦敏)
HUANG Jun (黄俊)
School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Source
Computer Technology and Development (《计算机技术与发展》)
2021, No. 3, pp. 45-51 (7 pages)
Funding
Research Program of the Science and Technology Commission of Shanghai Municipality (17511107203).