Funding: Supported by the National Key Laboratory for Complex Systems Simulation Foundation (6142006190301).
Abstract: In the context of big data, many large-scale knowledge graphs have emerged to organize the explosive growth of web data on the Internet. Quality assessment is essential for selecting a suitable knowledge graph from the many available. As an important aspect of quality assessment, completeness assessment generally refers to the ratio of the current data volume to the total data volume. When evaluating the completeness of a knowledge graph, it is often necessary to refine the completeness dimension by setting different completeness metrics, so that the evaluation results are more comprehensive and understandable. However, lack of awareness of requirements is the most problematic quality issue: in the actual evaluation process, the existing completeness metrics need to take the actual application into account. Therefore, to accurately recommend suitable knowledge graphs to users, it is particularly important to develop relevant metrics and formulate measurement schemes for completeness. In this paper, we first clarify the concept of completeness, then establish each completeness metric, and finally design a measurement proposal for the completeness of knowledge graphs.
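The ratio definition of completeness in this abstract can be illustrated with a minimal sketch. The function names, the toy graph, and the idea of counting required property slots per entity are assumptions made for illustration; the paper's own metrics are not given in the abstract.

```python
from typing import Dict, Iterable, Set


def completeness_ratio(present: int, total: int) -> float:
    """Ratio of the current data volume to the total (expected) data volume."""
    if total <= 0:
        raise ValueError("total must be positive")
    return present / total


def property_completeness(entities: Dict[str, Set[str]],
                          required: Iterable[str]) -> float:
    """Hypothetical refinement: share of required property slots that are filled."""
    required_set = set(required)
    slots = len(entities) * len(required_set)
    filled = sum(len(props & required_set) for props in entities.values())
    return completeness_ratio(filled, slots)


# Toy graph: each entity maps to the set of properties it currently has.
kg = {"Berlin": {"country", "population"}, "Paris": {"country"}}
score = property_completeness(kg, ["country", "population"])
print(score)  # 3 of 4 required slots are filled
```

Choosing which properties count as required is where the application requirements mentioned in the abstract would enter such a metric.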
Funding: Supported by the Beijing Municipal Science and Technology Program (No. Z231100001323004).
Abstract: Using graph neural networks for knowledge embedding has become an important research direction in knowledge graph completion (KGC). However, the number of nodes in a knowledge graph increases exponentially with the depth of the tree, whereas distances between nodes in Euclidean space are second-order polynomial distances, so knowledge embedding with graph neural networks in Euclidean space does not represent the distances between nodes well. This paper introduces a novel approach, the hyperbolic hierarchical graph attention network (H2GAT), to address this limitation. First, the paper performs knowledge representation in hyperbolic space, which mitigates the exponential growth of nodes with tree depth and the consequent information loss. Second, it introduces a hierarchical graph attention mechanism designed specifically for hyperbolic space, which better captures the network structure inherent in the knowledge graph. Finally, the proposed H2GAT model is evaluated on the benchmark datasets WN18RR and FB15K-237. H2GAT achieves 0.445, 0.515, and 0.586 on Hits@1, Hits@3, and Hits@10 respectively on WN18RR, and 0.243, 0.367, and 0.518 on FB15K-237. By combining hyperbolic space embedding with hierarchical graph attention, H2GAT addresses the limitations of existing hyperbolic knowledge embedding models and demonstrates its competence in knowledge graph completion tasks.
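To make the contrast between Euclidean and hyperbolic distance concrete, here is a minimal sketch of the distance function on the Poincaré ball. The abstract does not say which model of hyperbolic space H2GAT uses, so the Poincaré ball and every name below are assumptions for illustration only.

```python
import math
from typing import Sequence


def _sq_norm(x: Sequence[float]) -> float:
    """Squared Euclidean norm."""
    return sum(c * c for c in x)


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance between two points inside the unit Poincare ball."""
    nu, nv = _sq_norm(u), _sq_norm(v)
    if nu >= 1.0 or nv >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = _sq_norm([a - b for a, b in zip(u, v)])
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - nu) * (1.0 - nv)))


origin = (0.0, 0.0)
inner = (0.9, 0.0)
outer = (0.99, 0.0)

# The Euclidean gap between inner and outer is only 0.09, but the hyperbolic
# gap is far larger, which leaves room for many deep tree nodes near the boundary.
print(poincare_distance(origin, inner))
print(poincare_distance(origin, outer))
print(poincare_distance(inner, outer))
```

Distances blow up near the boundary of the ball, which is the property that lets tree-like hierarchies be embedded with low distortion.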
Funding: Supported by the National Natural Science Foundation of China (No. 61876144).
Abstract: To address the problem that many valid triples are missing from knowledge graphs (KGs), a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed. It employs a joint learning strategy for knowledge graph completion (KGC). Related work has shown the superiority of CNNs in extracting semantic features of triple embeddings. However, these studies use only a single filter shape and fail to extract semantic features of different granularity. To solve this problem, ConvKG uses multi-shaped filters that convolve jointly over the triple embeddings, learning semantic features of different granularity together. Filters of different shapes cover regions of different sizes on the triple embeddings and capture pairwise interactions of different granularity among the triple elements. Experimental results confirm the strength of joint learning: compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
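The mean rank (MR) and Hits@k metrics reported in these abstracts can be computed from the rank that a model assigns to each correct entity. The sketch below uses made-up ranks and hypothetical function names; it shows the standard definitions, not any paper's evaluation code.

```python
from typing import Sequence


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the correct entity (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Illustrative ranks (1 = best) of the correct tail for five test triples.
ranks = [1, 3, 12, 2, 50]
print(mean_rank(ranks))
print(hits_at_k(ranks, 1), hits_at_k(ranks, 3), hits_at_k(ranks, 10))
```

MR is sensitive to a few very poor ranks, while Hits@k ignores how far outside the top k a miss falls, which is why papers usually report both.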
Funding: Supported by the National Natural Science Foundation of China (No. 61876144).
Abstract: Knowledge graph (KG) link prediction aims to address the problem that many valid triples are missing from KGs. Existing approaches either struggle to model the message passing process of multi-hop paths efficiently or lack transparency in how the model makes predictions. In this paper, a new graph convolutional network, the path semantic-aware graph convolution network (PSGCN), is proposed to model the semantic information of multi-hop paths. PSGCN first uses a random walk strategy to obtain all-hop paths in KGs, then captures the semantics of the paths with Word2Sec and long short-term memory (LSTM) models, and finally converts them into a potential representation for the graph convolution network (GCN) message passing process. PSGCN combines path-based inference methods and graph neural networks to achieve better interpretability and scalability. In addition, to verify the robustness of the model, different values of the path threshold K are tested on the FB15K-237 and WN18RR datasets, and the final results demonstrate the effectiveness of the model.
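The first step described above, sampling multi-hop paths by random walk with a hop limit K, might look like the following minimal sketch. The adjacency structure, function names, and toy triples are assumptions; the abstract does not describe PSGCN's actual sampling procedure in detail.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_adjacency(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
    return adj


def random_walk(adj: Dict[str, List[Tuple[str, str]]], start: str,
                max_hops: int, rng: random.Random) -> List[str]:
    """Walk at most max_hops edges; return entities and relations alternating."""
    path = [start]
    node = start
    for _ in range(max_hops):
        edges = adj.get(node)
        if not edges:  # dead end: no outgoing edges
            break
        rel, node = rng.choice(edges)
        path.extend([rel, node])
    return path


triples = [("a", "r1", "b"), ("b", "r2", "c")]
adj = build_adjacency(triples)
path = random_walk(adj, "a", max_hops=3, rng=random.Random(0))
print(path)  # the toy graph has one outgoing edge per node, so the walk is fixed
```

The resulting alternating sequence of entities and relations is the kind of token sequence that a sequence model such as an LSTM could then encode.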
Funding: Supported partly by the National Natural Science Foundation of China (Nos. 62372119 and 62166003); the Project of Guangxi Science and Technology (Nos. GuiKeAB23026040 and GuiKeAD20159041); the Innovation Project of Guangxi Graduate Education (No. YCSW2023188); Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, China, Intelligent Processing and the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (Nos. 20-A-01-01 and MIMS21-M01); the Open Research Fund of Guangxi Key Lab of Human-machine Interaction and Intelligent Decision (No. GXHIID2206); the Guangxi Collaborative Innovation Center of Multi-Source Information Integration; and the Guangxi "Bagui" Teams for Innovation and Research, China.
Abstract: Knowledge representation learning (KRL) aims to encode the entities and relationships in knowledge graphs as low-dimensional continuous vectors. It is widely used in knowledge graph completion (KGC), also called link prediction. Translation-based KRL methods perform well in KGC. However, the translation principles adopted by these methods are too strict and cannot model complex entities and relationships (i.e., N-1, 1-N, and N-N) well. In addition, these traditional translation principles are mainly applied to static knowledge graphs and overlook the temporal properties of triplet facts. Therefore, we propose a temporal knowledge graph embedding model based on variable translation (TKGE-VT). The model introduces a new variable translation principle that enables flexible transformation between entity and relationship embeddings. This paper also considers the temporal properties of both entities and relationships and applies the proposed variable translation principle to temporal knowledge graphs. We conduct link prediction and triplet classification experiments on four benchmark datasets: WN11, WN18, FB13, and FB15K. According to the experimental results, our model outperforms the baseline models on multiple evaluation metrics.
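Why a strict translation principle struggles with 1-N relations can be seen in a small sketch of the classic fixed translation score (TransE-style, where head + relation should land on the tail). This is the baseline principle the abstract criticizes, not the variable translation of TKGE-VT, which the abstract does not specify; the vectors are made up.

```python
import math
from typing import Sequence


def translation_distance(h: Sequence[float], r: Sequence[float],
                         t: Sequence[float]) -> float:
    """Euclidean norm of h + r - t: zero means the triple fits perfectly."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


head = [1.0, 0.0]
relation = [0.0, 1.0]
tail_a = [1.0, 1.0]  # exactly head + relation
tail_b = [2.0, 1.0]  # a second, different valid tail of the same 1-N relation

d_a = translation_distance(head, relation, tail_a)
d_b = translation_distance(head, relation, tail_b)
print(d_a, d_b)
```

With a fixed translation, two tails can both reach distance zero only if their embeddings coincide, so distinct tails of one head and relation must either collapse together or fit imperfectly.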
Abstract: In the link prediction task of knowledge graph completion, previous studies have shown that models based on graph neural networks (GNNs) yield large improvements in prediction results. However, many previous efforts were limited to aggregating the information given by neighboring nodes and did not exploit the information carried by the edges, that is, by the relations. To address this problem, Coupling Relation Strength with Graph Convolutional Networks (RS-GCN) is proposed, an encoder-decoder model that embeds entities and relations in a vector space. On the encoder side, RS-GCN captures graph structure and neighborhood information while aggregating the information given by neighboring nodes. On the decoder side, RotatE is used to model and infer various relational patterns. The model is evaluated on the standard FB15k, WN18, FB15k-237, and WN18RR datasets, and the experiments show that RS-GCN achieves better results than current state-of-the-art classical models on these datasets.
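The RotatE decoder mentioned above scores a triple by rotating the head embedding in the complex plane and measuring how far the result lands from the tail. The sketch below shows only that scoring idea with made-up embeddings and hypothetical names; it is not RS-GCN's implementation, and how the encoder output feeds the decoder is not described in the abstract.

```python
import cmath
import math
from typing import Sequence


def rotate_distance(head: Sequence[complex], phases: Sequence[float],
                    tail: Sequence[complex]) -> float:
    """Sum over dimensions of |h_i * e^{i*theta_i} - t_i| (lower is better)."""
    total = 0.0
    for h, theta, t in zip(head, phases, tail):
        rotation = cmath.exp(1j * theta)  # unit-modulus relation element
        total += abs(h * rotation - t)
    return total


head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi]  # rotate by 90 and 180 degrees
good_tail = [0 + 1j, 0 - 1j]     # exactly the rotated head
bad_tail = [1 + 0j, 0 + 1j]      # the unrotated head

print(rotate_distance(head, phases, good_tail))
print(rotate_distance(head, phases, bad_tail))
```

Because each relation element is a pure rotation, a phase of pi models a symmetric relation (applying it twice returns to the start) and negated phases model an inverse relation.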
Funding: Supported by the National Natural Science Foundation of China (NSFC) key project (Nos. 61533018, U1736204, and 61661146007), the Ministry of Education and China Mobile Research Fund (No. 20181770250), and the THUNUS NExT Co-Lab.
Abstract: Knowledge bases (KBs) are often highly incomplete, which creates a demand for KB completion. Although XLORE is an English-Chinese bilingual knowledge graph, it contains only 423,974 cross-lingual links between English and Chinese instances. We present XLORE2, an extension of XLORE that is built automatically from Wikipedia, Baidu Baike, and Hudong Baike. We add more facts through cross-lingual knowledge linking, cross-lingual property matching, and fine-grained type inference. We also design an entity linking system to demonstrate the effectiveness and broad coverage of XLORE2.
Funding: Supported by the "Science and Technology Innovation Action Plan" of the Shanghai Science and Technology Commission for Social Development Project (21DZ1204900).
Abstract: With the development of information fusion, knowledge graph completion tasks have received a lot of attention. Some studies investigate the broader underlying problems of linguistics, while embedding learning has a narrow focus. This poses significant challenges due to the heterogeneity of coarse-graining patterns. To address these issues, a completion framework named Triple Encoder-Scoring Module (TEsm) is designed. The model employs an alternating two-branch structure that fuses local features into the interaction pattern of the triplet itself by combining distance and structure models. Moreover, it is mapped to a uniform shared space. After completion, an ensemble inference method is proposed to query multiple predictions from different graphs using a weight classifier. The experimental dataset used for the completion task is DBpedia, which contains five different linguistic subsets. Extensive experimental results demonstrate that TEsm solves the completion task efficiently, validating the performance of the proposed model.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61991412 and the Program for HUST Academic Frontier Youth Team under Grant No. 2018QYTD07.
Abstract: Knowledge graphs are used in more and more applications to further improve intelligence. Because knowledge graphs are inherently incomplete as a result of data updates and missing data, a number of knowledge graph completion models have been proposed in succession. To obtain better performance, many methods are highly complex, which makes training and inference time-consuming. This paper proposes a simple but effective model using only shallow neural networks, which combines enhanced feature interaction and multi-subspace information integration. In the enhanced feature interaction module, entity and relation embeddings interact in an almost peer-to-peer manner via multi-channel 2D convolution. In the multi-subspace information integration module, entity and relation embeddings are projected into multiple subspaces to extract multi-view information and further boost performance. Extensive experiments on widely used datasets show that the proposed model outperforms a series of strong baselines, and ablation studies demonstrate the effectiveness of each submodule in the model.
Funding: Supported by the National Natural Science Foundation of China (No. U19A2059) and the 2022 Research Foundation of Chengdu Textile College (No. X22032161).
Abstract: Currently, most inductive relation prediction approaches are based on subgraph structures, with subgraph features extracted using graph neural networks to predict relations. However, subgraphs may contain disconnected regions, which usually represent different semantic ranges. Because not all semantic information about these regions is helpful in relation prediction, we propose a relation prediction model based on a disentangled subgraph structure and implement a feature updating approach based on relevant semantic aggregation. To achieve the disentangled subgraph structure indirectly from a semantic perspective, entity features are mapped into different semantic spaces, and related semantics are aggregated and updated within each semantic space. The disentangled model can focus on features with higher semantic relevance during prediction, thus addressing a problem with existing approaches, which ignore the semantic differences between subgraph structures. Furthermore, using a gated recurrent neural network, the model enhances entity features by sorting entities by distance and extracting the path information in the subgraphs. Experiments show that when the subgraph contains numerous disconnected regions, our model outperforms existing mainstream models in terms of both area under the precision-recall curve (AUC-PR) and Hits@10. The experiments indicate that semantic differences in the knowledge graph can be effectively distinguished and verify the effectiveness of the method.
Abstract: Knowledge graphs (KGs) have profoundly impacted many research fields. However, KGs suffer from low data integrity. The binary-relational knowledge graph is the more common form of KG but is limited by the small amount of information it carries, so there is often little content to use when predicting missing entities (relations). The hyper-relational knowledge graph is another form of KG, which adds much additional information (qualifiers) to the main triple and can effectively improve the accuracy of predicting missing entities (relations). Existing hyper-relational link prediction methods consider only the overall perspective when dealing with qualifiers and calculate the score function by combining the qualifiers with the main triple. However, these methods overlook the inherent characteristics of entities and relations. This paper proposes a novel Local and Global Hyper-relation Aggregation Embedding for Link Prediction (LGHAE). LGHAE captures the semantic features of hyper-relational data from both local and global perspectives. To fully utilize local and global features, a new decoder, Hyper-InteractE, is designed to predict missing entities. We validated the feasibility of LGHAE by comparing it with state-of-the-art models on public datasets.