Abstract
Knowledge graph data are imbalanced across relation categories, and training instances differ in difficulty, so sampling training data uniformly at random can keep embedding models from converging quickly. To address this, an adaptive method for selecting training data is proposed. The training data are grouped by relation category. During sampling, a relation category is first chosen according to a probability, and an instance is then drawn at random from the chosen group for training. The probability of selecting each group is adjusted adaptively according to the training effect. Experimental results show that adaptive grouped selection achieves better results on the link prediction task and enables the embedding model to converge faster and to a better solution.
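The two-stage sampling described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how group probabilities are adjusted, so the update rule here (an exponential moving average of each group's recent training loss, so that harder groups are sampled more often) is an assumption, as are the names `AdaptiveGroupSampler`, `smoothing`, and `floor`.

```python
import random
from collections import defaultdict


class AdaptiveGroupSampler:
    """Pick a relation group by probability, then a random triple from it."""

    def __init__(self, triples, smoothing=0.9, floor=1e-3, seed=0):
        # Group (head, relation, tail) triples by their relation.
        self.groups = defaultdict(list)
        for head, relation, tail in triples:
            self.groups[relation].append((head, relation, tail))
        self.relations = list(self.groups)
        # Every group starts with the same unnormalized weight.
        self.weights = {r: 1.0 for r in self.relations}
        self.smoothing = smoothing  # hypothetical EMA factor
        self.floor = floor          # keeps every group reachable
        self.rng = random.Random(seed)

    def probabilities(self):
        """Return the normalized selection probability of each group."""
        total = sum(self.weights.values())
        return {r: w / total for r, w in self.weights.items()}

    def sample(self):
        """Stage 1: choose a relation by weight. Stage 2: uniform triple."""
        group_weights = [self.weights[r] for r in self.relations]
        relation = self.rng.choices(self.relations, weights=group_weights, k=1)[0]
        return self.rng.choice(self.groups[relation])

    def update(self, relation, loss):
        """ASSUMED rule: move the group's weight toward its latest loss."""
        old = self.weights[relation]
        new = self.smoothing * old + (1.0 - self.smoothing) * loss
        self.weights[relation] = max(new, self.floor)
```

In a training loop, one would call `sample()` to obtain a triple, compute the embedding model's loss on it, and pass that loss back through `update(relation, loss)`. Other feedback signals, such as the change in validation rank per relation, could replace the loss under the same structure.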
Authors
欧阳丹彤
马骢
雷景佩
冯莎莎
OUYANG Dan-tong; MA Cong; LEI Jing-pei; FENG Sha-sha (College of Computer Science and Technology, Jilin University, Changchun 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China)
Source
《吉林大学学报(工学版)》
EI
CAS
CSCD
PKU Core Journal
2020, No. 2, pp. 685-691 (7 pages)
Journal of Jilin University: Engineering and Technology Edition
Funding
National Natural Science Foundation of China (61872159, 61672261, 61502199).
Keywords
artificial intelligence
knowledge graph embedding
translation-based embedding models
adaptive sampling
link prediction