Abstract
As big data is applied ever more widely in oil and gas exploration, many domain experts have begun building knowledge graphs to integrate relevant knowledge, yet many "knowledge islands" remain and exploration personnel still cannot share this knowledge. To address the complexity and particularity of oil and gas exploration data, and the scarcity of data in the early stage of graph construction, this paper proposes a knowledge representation and fusion method for small data sets in oil and gas exploration. To handle data heterogeneity in knowledge fusion, the method adds a knowledge representation module that represents the graph by combining a database with the Web Ontology Language (OWL), so that data of multiple structures can be combined in a unified way while preserving the reliability of the data content. Knowledge fusion is divided into ontology fusion and entity fusion. Many fusion techniques exist, including embedding representations based on model training and a range of unsupervised learning methods. The method targets the initial stage of graph construction in the oil and gas exploration domain, where data collection is difficult, the data available for training are very limited, and models and clustering algorithms are hard to apply. Ontology fusion uses unsupervised learning with a text matching algorithm to compute the similarity of graph node features, then uses active learning to correct the data through human-computer interaction and obtain the final result. Entity fusion follows the same approach and additionally uses the Inverse Document Frequency (IDF) coefficient and structural similarity to improve the accuracy of the similarity computed between two entities. The method aims to provide a knowledge fusion framework for oil and gas exploration that is easy to optimize, has low data requirements, and performs relatively well.
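The abstract does not give the paper's formulas, so the following is only a minimal sketch of how an IDF-weighted text similarity, a structural similarity, and an active-learning review step might fit together. Every name here (`idf_weights`, `text_similarity`, `structural_similarity`, `entity_similarity`, `triage`), the mixing weight `alpha`, and the two thresholds are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(name):
    """Lower-case whitespace tokenizer (an assumption; real data may need a segmenter)."""
    return name.lower().split()


def idf_weights(names):
    """Smoothed inverse document frequency of each token over all entity names."""
    n = len(names)
    df = Counter(tok for name in names for tok in set(tokenize(name)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def text_similarity(a, b, idf):
    """IDF-weighted Jaccard overlap: shared rare tokens count more than common ones."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    union = ta | tb
    if not union:
        return 0.0
    shared = sum(idf.get(t, 1.0) for t in ta & tb)
    total = sum(idf.get(t, 1.0) for t in union)
    return shared / total


def structural_similarity(neighbors_a, neighbors_b):
    """Jaccard overlap of the two entities' neighbor sets in the graph."""
    na, nb = set(neighbors_a), set(neighbors_b)
    if not na and not nb:
        return 0.0
    return len(na & nb) / len(na | nb)


def entity_similarity(a, b, idf, neighbors, alpha=0.7):
    """Weighted mix of text and structural similarity (alpha is a hypothetical weight)."""
    text = text_similarity(a, b, idf)
    struct = structural_similarity(neighbors.get(a, ()), neighbors.get(b, ()))
    return alpha * text + (1 - alpha) * struct


def triage(score, accept=0.8, reject=0.3):
    """Route a candidate pair: merge, keep distinct, or send to a human reviewer."""
    if score >= accept:
        return "merge"
    if score < reject:
        return "distinct"
    return "ask_expert"
```

In this sketch only the pairs that fall between the two thresholds are shown to a domain expert, which keeps the manual effort small when labeled data are scarce. A token such as "basin" that appears in most entity names receives a low IDF weight, so two names that share only that token score lower than they would under a plain Jaccard overlap.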
Authors
田芷瑜
卫乾
赵世亮
许野
TIAN Zhiyu; WEI Qian; ZHAO Shiliang; XU Ye (Kunlun Digital Intelligence Technology Co., Ltd., Beijing 102206, China)
Source
《信息与电脑》
2023, No. 19, pp. 164-170 (7 pages)
Information & Computer
Keywords
knowledge representation
knowledge fusion
knowledge graph
oil and gas exploration
geological exploration
unsupervised learning