Abstract
To further improve data quality, a bootstrapping strategy based on bidirectional global filtering is proposed. Rather than considering only unidirectional, local alignments, the strategy applies a nearest-neighbor selection algorithm with a one-to-one constraint to capture global structure and bidirectional alignment information. This ensures, at the global level, that each source entity is aligned with a single target entity, which reduces erroneous samples and yields high-quality training data. In experiments on three real-world cross-lingual datasets, Hits@1 stabilizes at about 96% on average. This indicates that the method can annotate training data automatically and effectively and can produce high-quality alignment results, thereby improving the accuracy and reliability of entity alignment. The method has broad application prospects for merging and extending knowledge graphs.
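The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of one common way to realize a bidirectional, one-to-one nearest-neighbor filter: keep a candidate pair only when the source and target entities are each other's nearest neighbor. The function name `mutual_nearest_pairs`, the `threshold` parameter, and the toy similarity matrix are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def mutual_nearest_pairs(
    sim: List[List[float]], threshold: float = 0.0
) -> List[Tuple[int, int]]:
    """Return (source, target) index pairs that are each other's nearest neighbor.

    sim[i][j] is a hypothetical similarity score between source entity i
    and target entity j (higher means more similar).
    """
    if not sim or not sim[0]:
        return []
    n_src, n_tgt = len(sim), len(sim[0])

    # Forward direction: the best target for every source entity.
    best_tgt = [max(range(n_tgt), key=lambda j: sim[i][j]) for i in range(n_src)]
    # Backward direction: the best source for every target entity.
    best_src = [max(range(n_src), key=lambda i: sim[i][j]) for j in range(n_tgt)]

    pairs = []
    for i, j in enumerate(best_tgt):
        # Keep the pair only if both directions agree and the score is high enough.
        if best_src[j] == i and sim[i][j] >= threshold:
            pairs.append((i, j))
    return pairs


# Toy example: sources 0 and 1 both prefer target 0, but target 0 prefers source 0.
sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
pairs = mutual_nearest_pairs(sim, threshold=0.5)
print(pairs)  # [(0, 0), (2, 2)]
```

In this sketch, source 1 is left unlabeled because its nearest target already has a better match, which is how a purely unidirectional filter would have introduced a conflicting sample. In a bootstrapping loop, the surviving pairs would be added to the training set and the similarity scores recomputed in the next round.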
Authors
林学渊
鄂海红
宋文宇
罗浩然
宋美娜
LIN Xueyuan; E Haihong; SONG Wenyu; LUO Haoran; SONG Meina (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source
《中国科技论文在线精品论文》
2023, No. 2, pp. 144-147 (4 pages)
Highlights of Sciencepaper Online
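Funding
National Natural Science Foundation of China (62176026)
National Natural Science Foundation of China, Young Scientists Fund (61902034)
Beijing Natural Science Foundation (M22009)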
Keywords
artificial intelligence
knowledge graph
entity alignment
bootstrapping