Identification of the underlying partial differential equations (PDEs) of complex systems remains a formidable challenge. In the present study, a robust PDE identification method is proposed that can extract accurate governing equations under noisy conditions without prior knowledge. Specifically, the proposed method combines gene expression programming, a type of evolutionary algorithm capable of generating unseen terms based solely on basic operators and functional terms, with symbolic regression neural networks. These networks are designed to represent explicit functional expressions and optimize them with data gradients. In particular, the specifically designed neural networks can be easily transformed into physical constraints for the training data, embedding the discovered PDEs to further optimize the metadata used for iterative PDE identification. The proposed method has been tested on four canonical PDE cases, validating its effectiveness without preliminary information and confirming its suitability for practical applications across various noise levels.
Using graph neural networks for knowledge embedding to accomplish knowledge graph completion (KGC) has become an important research area. However, the number of nodes in a knowledge graph increases exponentially with the depth of the tree, whereas distances between nodes in Euclidean space are second-order polynomial distances, so knowledge embedding using graph neural networks in Euclidean space does not represent the distances between nodes well. This paper introduces a novel approach called hyperbolic hierarchical graph attention network (H2GAT) to rectify this limitation. First, the paper conducts knowledge representation in hyperbolic space, effectively mitigating the issue of exponential growth of nodes with tree depth and the consequent information loss. Second, it introduces a hierarchical graph attention mechanism specifically designed for hyperbolic space, allowing for enhanced capture of the network structure inherent in the knowledge graph. Finally, the efficacy of the proposed H2GAT model is evaluated on the benchmark datasets WN18RR and FB15K-237. The H2GAT model achieved 0.445, 0.515, and 0.586 in the Hits@1, Hits@3, and Hits@10 metrics, respectively, on the WN18RR dataset, and 0.243, 0.367, and 0.518 on the FB15K-237 dataset. By incorporating hyperbolic space embedding and hierarchical graph attention, the H2GAT model addresses the limitations of existing hyperbolic knowledge embedding models and demonstrates its competence in knowledge graph completion tasks.
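The contrast between Euclidean and hyperbolic distances described above can be illustrated with a minimal sketch. It assumes the Poincaré ball model of hyperbolic space, which the abstract does not specify, and the function names are hypothetical. It shows only how distances between points near the boundary of the ball grow much faster than their Euclidean counterparts, not the H2GAT model itself.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball (assumed model)."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Closed form: arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)

def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two points at the same radius on perpendicular axes, moved toward the boundary.
for radius in (0.5, 0.9, 0.99):
    p, q = (radius, 0.0), (0.0, radius)
    print(radius, euclidean_distance(p, q), poincare_distance(p, q))
```

As the radius approaches 1, the Euclidean distance stays below the square root of 2 while the hyperbolic distance keeps growing, which is the property that leaves room for tree-like structures with many leaves.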
Purpose: Due to the incomplete nature of knowledge graphs (KGs), the task of predicting missing links between entities becomes important. Many previous approaches are static, which poses the notable problem that all meanings of a polysemous entity share one embedding vector. This study proposes a polysemous embedding approach, named KG embedding under relational contexts (ContE for short), for missing link prediction. Design/methodology/approach: ContE models and infers different relationship patterns by considering the context of the relationship, which is implicit in the local neighborhood of the relationship. The forward and backward impacts of the relationship in ContE are mapped to two different embedding vectors, which represent the contextual information of the relationship. Then, according to the position of the entity, the entity's polysemous representation is obtained by adding its static embedding vector to the corresponding context vector of the relationship. Findings: ContE is fully expressive, that is, given any ground truth over the triples, there are embedding assignments to entities and relations that can precisely separate the true triples from false ones. ContE is capable of modeling four connectivity patterns: symmetry, antisymmetry, inversion, and composition. Research limitations: ContE needs a grid search to find the parameters that give the best performance in practice, which is time-consuming. Sometimes, it requires longer entity vectors than some other models to achieve better performance. Practical implications: ContE is a bilinear model, a quite simple model that could be applied to large-scale KGs. By considering the contexts of relations, ContE can distinguish the exact meaning of an entity in different triples, so that when performing compositional reasoning it is capable of inferring the connectivity patterns of relations and achieves good performance on link prediction tasks. Originality/value: ContE considers the contexts of entities in terms of their positions in triples and the relationships they link to. It decomposes a relation vector into two vectors, namely a forward impact vector and a backward impact vector, in order to capture the relational contexts. ContE has the same low computational complexity as TransE. Therefore, it provides a new approach for contextualized knowledge graph embedding.
Accurate prediction of future events brings great benefits and reduces losses for society in many domains, such as civil unrest, pandemics, and crimes. A knowledge graph is a general language for describing and modeling complex systems. Different types of events continually occur, and they are often related to historical and concurrent events. In this paper, we formalize future event prediction as a temporal knowledge graph reasoning problem. Most existing studies either conduct reasoning on static knowledge graphs or assume that the knowledge graphs of all timestamps are available during the training process. As a result, they cannot effectively reason over temporal knowledge graphs and predict events happening in the future. To address this problem, some recent works learn to infer future events based on historical event-based temporal knowledge graphs. However, these methods do not comprehensively consider the latent patterns and influences behind historical events and concurrent events simultaneously. This paper proposes a new graph representation learning model, namely Recurrent Event Graph ATtention Network (RE-GAT), based on a novel historical and concurrent events attention-aware mechanism that models the event knowledge graph sequence recurrently. More specifically, RE-GAT uses an attention-based historical events embedding module to encode past events, and employs an attention-based concurrent events embedding module to model the associations of events at the same timestamp. A translation-based decoder module and a learning objective are developed to optimize the embeddings of entities and relations. We evaluate the proposed method on four benchmark datasets. Extensive experimental results demonstrate the superiority of RE-GAT compared with various baselines, which shows that our method can more accurately predict what events are going to happen.
Link prediction, also known as Knowledge Graph Completion (KGC), is the common task in Knowledge Graphs (KGs) of predicting missing connections between entities. Most existing methods focus on designing shallow, scalable models, which are less expressive than deep, multi-layer models. Furthermore, most operations, such as addition, matrix multiplication, or factorization, are handcrafted based on a few known relation patterns in several well-known datasets, such as FB15k and WN18. However, due to the diversity and complex nature of real-world data distributions, it is inherently difficult to preset all latent patterns. To address this issue, we propose KGE-ANS, a novel knowledge graph embedding framework for general link prediction tasks using automatic network search. KGE-ANS can learn a deep, multi-layer, effective architecture that adapts to different datasets through neural architecture search. In addition, the general search space we designed is tailored for KG tasks. We perform extensive experiments on benchmark datasets and on the dataset constructed in this paper. The results show that KGE-ANS outperforms several state-of-the-art methods, especially on datasets with complex relation patterns.
Knowledge graphs (KGs) have been widely accepted as powerful tools for modeling the complex relationships between concepts and developing knowledge-based services. In recent years, researchers in the field of power systems have explored KGs to develop intelligent dispatching systems for increasingly large power grids. With multiple power grid dispatching knowledge graphs (PDKGs) constructed by different agencies, the knowledge fusion of different PDKGs is useful for providing more accurate decision support. To achieve this, entity alignment, which aims at connecting different KGs by identifying equivalent entities, is a critical step. Existing entity alignment methods cannot integrate useful structural, attribute, and relational information while calculating entities' similarities and are prone to making many-to-one alignments, and thus can hardly achieve the best performance. To address these issues, this paper proposes a collective entity alignment model that integrates three kinds of available information and makes collective counterpart assignments. This model proposes a novel knowledge graph attention network (KGAT) to learn the embeddings of entities and relations explicitly and calculates entities' similarities by adaptively incorporating the structural, attribute, and relational similarities. Then, we formulate the counterpart assignment task as an integer programming (IP) problem to obtain one-to-one alignments. We not only conduct experiments on a pair of PDKGs but also evaluate our model on three commonly used cross-lingual KGs. Experimental comparisons indicate that our model outperforms other methods and provides an effective tool for the knowledge fusion of PDKGs.
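The difference between independent (possibly many-to-one) matching and a collective one-to-one assignment can be sketched on a toy similarity matrix. This is an illustration only: the similarity values and function names are hypothetical, and an exhaustive search over permutations is swapped in for the integer programming solver mentioned in the abstract, which is feasible only for very small inputs.

```python
from itertools import permutations

def greedy_alignment(sim):
    """Each source entity independently picks its most similar target entity."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]

def one_to_one_alignment(sim):
    """Search all permutations for the one-to-one assignment with the highest total similarity."""
    n = len(sim)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(n)),
    )
    return list(best)

# Hypothetical similarities: rows are source entities, columns are target entities.
sim = [
    [0.90, 0.80, 0.10],
    [0.85, 0.30, 0.20],
    [0.20, 0.10, 0.70],
]

print(greedy_alignment(sim))      # sources 0 and 1 both claim target 0
print(one_to_one_alignment(sim))  # every target is used exactly once
```

In this example the independent choice maps two source entities to the same target, while the collective assignment gives up a little similarity for source 0 so that source 1 can keep its only good match.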
To solve the problem that many valid triples are missing from knowledge graphs (KGs), a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed, which employs a joint learning strategy for knowledge graph completion (KGC). Related research has shown the superiority of CNNs in extracting semantic features of triple embeddings. However, these studies use only one single-shaped filter and fail to extract semantic features of different granularity. To solve this problem, ConvKG exploits multi-shaped filters to co-convolve on the triple embeddings, jointly learning semantic features of different granularity. Differently shaped filters cover different sizes on the triple embeddings and capture pairwise interactions of different granularity among triple elements. Experimental results confirm the strength of joint learning. Compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
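The two evaluation metrics named above, mean rank (MR) and Hits@k, are standard in link prediction and can be computed as in the following minimal sketch. The rank values are made up for illustration and are not results from any of the listed papers. Each rank is the position of the correct entity in the model's sorted candidate list, with 1 being best.

```python
def mean_rank(ranks):
    """Average rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked in the top k; higher is better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct entity for six test triples.
ranks = [1, 3, 12, 2, 50, 7]

print(mean_rank(ranks))
print(hits_at_k(ranks, 1), hits_at_k(ranks, 10))
```

A single badly ranked triple (rank 50 here) pulls MR up sharply but changes Hits@10 by only one count, which is why the two metrics are usually reported together.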
Knowledge tracing is the key component in online individualized learning; it assesses users' mastery of skills and predicts the probability that users can solve specific problems. Available knowledge tracing models have the problem that the assessments are not directly used in the predictions. To make full use of the assessments during prediction, a novel model named deep knowledge tracing embedding neural network (DKTENN) is proposed in this work. DKTENN is a synthesis of deep knowledge tracing (DKT) and knowledge graph embedding (KGE). DKT uses long short-term memory (LSTM) to assess users and track their mastery of skills according to their interaction sequences with skill-level tags, and KGE is applied to predict the probability on the basis of both the embedded problems and DKT's assessments. In the experiments, DKTENN outperforms performance factors analysis and the other deep-learning-based knowledge tracing models.
Machine reading comprehension (MRC) has been a research focus in natural language processing and intelligence engineering. However, there is a lack of models and datasets for MRC tasks in the anti-terrorism domain. Moreover, current research lacks the ability to embed accurate background knowledge and provide precise answers. To address these two problems, this paper first builds, in a semi-automatic manner, a text corpus and testbed that focus on the anti-terrorism domain. Then, it proposes a knowledge-based machine reading comprehension model that fuses domain-related triples from a large-scale encyclopedic knowledge base to enhance the semantics of the text. To eliminate knowledge noise that could lead to semantic deviation, this paper uses a mixed mutual attention mechanism among questions, passages, and knowledge triples to select the most relevant triples before embedding their semantics into the sentences. Experimental results indicate that the proposed approach achieves a 70.70% EM value and an 87.91% F1 score, improvements of 4.23% and 3.35% over existing methods, respectively.
When training a large-scale knowledge graph embedding (KGE) model with multiple graphics processing units (GPUs), a partition-based method is necessary for parallel training. However, existing partition-based training methods suffer from low GPU utilization and high input/output (IO) overhead between the memory and disk. To address the high IO overhead, we optimized twice partitioning with fine-grained GPU scheduling to reduce the IO overhead between the CPU memory and disk. To address the low GPU utilization caused by GPU load imbalance, we proposed balanced partitioning and dynamic scheduling methods to accelerate training in different cases. With the above methods, we proposed fine-grained partitioning KGE, an efficient KGE training framework for multiple GPUs. We conducted experiments on several knowledge graph benchmarks, and the results show that our method achieves a speedup over existing frameworks for KGE training.
News recommendation systems are designed to deal with massive amounts of news and provide personalized recommendations for users. Accurately capturing user preferences and modeling news and users is the key to news recommendation. In this paper, we propose a new framework, a news recommendation system based on topic embedding and knowledge embedding (NRTK). NRTK handles the news titles that users have clicked on from two perspectives to obtain news and user representation embeddings: 1) extracting explicit and latent topic features from news and mining users' preferences for them in historical behaviors; 2) extracting entities and propagating users' potential preferences in the knowledge graph. Experiments on a real-world dataset validate the effectiveness and efficiency of our approach.
Knowledge graph embedding, which maps entities and relations into low-dimensional vector spaces, has demonstrated its effectiveness in many tasks such as link prediction and relation extraction. Typical methods include TransE, TransH, and TransR. All these methods map different relations into the vector space separately, and the intrinsic correlations of these relations are ignored. Correlations clearly exist among relations because different relations may connect to a common entity. For example, the triples (Steve Jobs, PlaceOfBirth, California) and (Apple Inc., Location, California) share the same entity California as their tail entity. We analyze the embedded relation matrices learned by TransE/TransH/TransR and find that the correlations of relations do exist and appear as low-rank structure in the embedded relation matrix. It is natural to ask whether we can leverage these correlations to learn better embeddings for the entities and relations in a knowledge graph. In this paper, we propose to learn the embedded relation matrix by decomposing it as a product of two low-dimensional matrices, in order to characterize the low-rank structure. The proposed method, called TransCoRe (Translation-Based Method via Modeling the Correlations of Relations), learns the embeddings of entities and relations within a translation-based framework. Experimental results on the benchmark datasets of WordNet and Freebase demonstrate that our method outperforms the typical baselines on link prediction and triple classification tasks.
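The idea of a low-rank relation matrix inside a translation-based model can be sketched as follows. This is a toy illustration under stated assumptions, not the TransCoRe training procedure: the matrices `A` and `B`, their shapes, and all numeric values are hypothetical, and the score is the usual TransE-style L1 distance between `h + r` and `t`.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x d)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transe_score(h, r, t):
    """Translation-based score: L1 norm of h + r - t; lower means more plausible."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

# Hypothetical factors: 3 relations, rank 2, embedding dimension 4.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]           # relation-to-factor weights
B = [[0.5, -0.5, 0.0, 1.0], [0.0, 1.0, -1.0, 0.5]]  # shared basis vectors
R = matmul(A, B)  # each row is one relation embedding; the rank of R is at most 2

head = [0.1, 0.2, 0.3, 0.4]
tail = [hi + ri for hi, ri in zip(head, R[0])]  # tail constructed to satisfy relation 0

print(R[2])                              # equals R[0] + R[1]: the relations are correlated
print(transe_score(head, R[0], tail))    # a perfect translation scores 0
print(transe_score(head, R[1], tail))    # a mismatched relation scores higher
```

Because every relation vector is a combination of the same few rows of `B`, relations share parameters, which is one way to express the correlations among relations that the abstract describes.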
COVID-19 evolves rapidly, and an enormous number of people worldwide desire instant access to COVID-19 information such as overviews, clinical knowledge, vaccines, prevention measures, and COVID-19 mutations. Question answering (QA) has become the mainstream way for users to consume ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system that offers consulting services at all times to relieve the stress on health services. In particular, people increasingly pay attention to complex multi-hop questions rather than simple ones during the prolonged pandemic, but existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations in large-scale COVID-19 knowledge graphs to return answers to a given question. In the field of question answering over knowledge graphs, current methods usually represent entities and schemas with knowledge embedding models and represent questions with pre-trained models. While it is convenient to represent different kinds of knowledge (i.e., entities and questions) with dedicated embeddings, an issue arises: these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space through a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings, which serve as important prior knowledge, to help search for the correct answer entity of the specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA) for COKG-QA based on the linked knowledge graph OpenKG-COVID-19 launched by OpenKG, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance on the 1-hop and 2-hop data while obtaining the best result, with significant improvements, on the 3-hop data. It is also more efficient for use in a QA system. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.
Knowledge graph representation has been a long-standing goal of artificial intelligence. In this paper, we consider a method for knowledge graph embedding of hyper-relational data, which are commonly found in knowledge graphs. Previous models such as Trans(E, H, R) and CTransR are either insufficient for embedding hyper-relational data or focus on projecting an entity into multiple embeddings, which might not be effective for generalization or accurately reflect real knowledge. To overcome these issues, we propose the novel model TransHR, which transforms the hyper-relations in a pair of entities into an individual vector, serving as a translation between them. We experimentally evaluate our model on two typical tasks: link prediction and triple classification. The results demonstrate that TransHR significantly outperforms Trans(E, H, R) and CTransR, especially for hyper-relational data.
As a joint optimization problem that simultaneously fulfills two different but correlated embedding tasks (i.e., entity embedding and relation embedding), the knowledge embedding problem is solved in a joint embedding scheme. In this embedding scheme, we design a joint compatibility scoring function to quantitatively evaluate relational facts with respect to entities and relations, and further incorporate the scoring function into a max-margin structure learning process that explicitly learns the embedding vectors of entities and relations using the context information of the knowledge base. By optimizing the joint problem, our design is capable of effectively capturing the intrinsic topological structures in the learned embedding spaces. Experimental results demonstrate the effectiveness of our embedding scheme in characterizing the semantic correlations among different relation units, and in relation prediction for knowledge inference.
Identifying semantic types for attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in databases. However, due to a lack of unified naming standards across prevalent information systems (a.k.a. information islands), AST identification remains an open problem. To tackle this problem, we propose a context-aware method to determine the ASTs for relations. We transform AST identification into a multi-class classification problem and propose a schema context aware (SCA) model to learn the representation from a collection of relations associated with attribute values and schema context. Based on the learned representation, we predict the AST for a given attribute from an underlying relation, wherein the predicted AST is mapped to one of the labeled ASTs. To improve the performance of AST identification, especially for the case in which the predicted semantic types of attributes are not included in the labeled ASTs, we then introduce knowledge base embeddings (a.k.a. KBVec) to enhance the above representation and construct a knowledge-base-enhanced schema context aware model (SCA-KB) to obtain a stable and robust model. Extensive experiments on real datasets demonstrate that our context-aware method outperforms the state-of-the-art approaches by a large margin, up to 6.14% and 25.17% in terms of macro average F1 score, and up to 0.28% and 9.56% in terms of weighted F1 score, on high-quality and low-quality datasets, respectively.
Funding (PDE identification method): Supported by the National Natural Science Foundation of China (Grant Nos. 92152102 and 92152202) and the Advanced Jet Propulsion Innovation Center/AEAC (Grant No. HKCX2022-01-010).
Funding (H2GAT): Supported by the Beijing Municipal Science and Technology Program (No. Z231100001323004).
Funding (ContE): Supported by the Key R&D Program Project of Zhejiang Province under Grant Nos. 2019C01004 and 2021C02004.
Funding (RE-GAT): Supported by the National Natural Science Foundation of China under Grant U19B2044 and the National Key Research and Development Program of China (2021YFC3300500).
Funding: Supported in part by the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-006.
Abstract: Link prediction, also known as Knowledge Graph Completion (KGC), is a common task in Knowledge Graphs (KGs) that predicts missing connections between entities. Most existing methods focus on designing shallow, scalable models, which are less expressive than deep, multi-layer models. Furthermore, most operations, such as addition, matrix multiplication, or factorization, are handcrafted based on a few known relation patterns in several well-known datasets, such as FB15k and WN18. However, due to the diversity and complex nature of real-world data distributions, it is inherently difficult to preset all latent patterns. To address this issue, we propose KGE-ANS, a novel knowledge graph embedding framework for general link prediction tasks using automatic network search. KGE-ANS can learn a deep, multi-layer, effective architecture that adapts to different datasets through neural architecture search. In addition, the general search space we designed is tailored for KG tasks. We perform extensive experiments on benchmark datasets and on the dataset constructed in this paper. The results show that KGE-ANS outperforms several state-of-the-art methods, especially on datasets with complex relation patterns.
Funding: Supported by the National Key R&D Program of China (2018AAA0101502) and the Science and Technology Project of SGCC (State Grid Corporation of China): Fundamental Theory of Human-in-the-Loop Hybrid-Augmented Intelligence for Power Grid Dispatch and Control.
Abstract: Knowledge graphs (KGs) have been widely accepted as powerful tools for modeling the complex relationships between concepts and developing knowledge-based services. In recent years, researchers in the field of power systems have explored KGs to develop intelligent dispatching systems for increasingly large power grids. With multiple power grid dispatching knowledge graphs (PDKGs) constructed by different agencies, the knowledge fusion of different PDKGs is useful for providing more accurate decision support. To achieve this, entity alignment, which aims at connecting different KGs by identifying equivalent entities, is a critical step. Existing entity alignment methods cannot integrate useful structural, attribute, and relational information while calculating entities' similarities and are prone to making many-to-one alignments, and thus can hardly achieve the best performance. To address these issues, this paper proposes a collective entity alignment model that integrates three kinds of available information and makes collective counterpart assignments. This model proposes a novel knowledge graph attention network (KGAT) to learn the embeddings of entities and relations explicitly and calculates entities' similarities by adaptively incorporating the structural, attribute, and relational similarities. Then, we formulate the counterpart assignment task as an integer programming (IP) problem to obtain one-to-one alignments. We not only conduct experiments on a pair of PDKGs but also evaluate our model on three commonly used cross-lingual KGs. Experimental comparisons indicate that our model outperforms other methods and provides an effective tool for the knowledge fusion of PDKGs.
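The difference between many-to-one and one-to-one alignment mentioned in this abstract can be illustrated with a small sketch. It is not the paper's model: the similarity matrix `sim` is made up, the function names are hypothetical, and an exhaustive search over permutations stands in for an integer programming solver, which is only feasible for tiny square inputs.

```python
from itertools import permutations

def greedy_alignment(sim):
    """Each source entity independently picks its most similar target."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]

def one_to_one_alignment(sim):
    """Maximise sum_i sim[i][p[i]] over all permutations p.

    This is the assignment objective under one-to-one constraints. Brute force
    is used here only as a stand-in for a real IP solver (an assumption).
    """
    n = len(sim)
    best = max(permutations(range(n)),
               key=lambda p: sum(sim[i][p[i]] for i in range(n)))
    return list(best)

# Hypothetical similarities: rows are entities of KG 1, columns entities of KG 2.
sim = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.30, 0.40, 0.50],
]

print(greedy_alignment(sim))      # rows 0 and 1 both choose column 0
print(one_to_one_alignment(sim))  # every column is used exactly once
```

In this toy case the greedy choice maps two source entities to the same target, while the constrained search gives each source a distinct target and a higher total similarity than any other one-to-one mapping.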
Funding: Supported by the National Natural Science Foundation of China (No. 61876144).
Abstract: To solve the problem of missing many valid triples in knowledge graphs (KGs), a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed, which employs a joint learning strategy for knowledge graph completion (KGC). Related research has shown the superiority of convolutional neural networks (CNNs) in extracting semantic features of triple embeddings. However, these studies use only a single-shaped filter and fail to extract semantic features of different granularity. To solve this problem, ConvKG exploits multi-shaped filters that convolve jointly over the triple embeddings, jointly learning semantic features of different granularity. Differently shaped filters cover different sizes on the triple embeddings and capture pairwise interactions of different granularity among triple elements. Experimental results confirm the strength of joint learning. Compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
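To make the idea of multi-shaped filters concrete, the sketch below slides two filters of different shapes over a triple stacked as a 3 x k matrix and concatenates the resulting feature maps. The layout (rows for head, relation, tail), the filter shapes 3x1 and 2x2, the ReLU, and all numeric values are assumptions for illustration; the abstract does not specify ConvKG's actual configuration.

```python
def conv2d_valid(matrix, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation on nested lists."""
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    return [[sum(matrix[i + a][j + b] * kernel[a][b]
                 for a in range(k_rows) for b in range(k_cols))
             for j in range(cols - k_cols + 1)]
            for i in range(rows - k_rows + 1)]

def relu(x):
    """Rectified linear unit for a single value."""
    return x if x > 0 else 0.0

def multi_shape_features(triple_matrix, kernels):
    """Apply every kernel, then flatten and concatenate all feature maps."""
    features = []
    for kernel in kernels:
        for row in conv2d_valid(triple_matrix, kernel):
            features.extend(relu(v) for v in row)
    return features

# Hypothetical 4-dimensional embeddings stacked as rows: head, relation, tail.
triple = [
    [0.1, 0.2, 0.3, 0.4],   # head
    [0.0, 0.1, 0.0, 0.1],   # relation
    [0.1, 0.3, 0.3, 0.5],   # tail
]

# Assumed filter shapes: the 3x1 filter sees one dimension of all three elements,
# the 2x2 filter sees two adjacent dimensions of two neighbouring rows.
kernels = [
    [[1.0], [1.0], [-1.0]],      # 3x1, computes h + r - t per dimension
    [[0.5, 0.5], [0.5, 0.5]],    # 2x2, coarser pairwise view
]

features = multi_shape_features(triple, kernels)
print(len(features))  # 4 values from the 3x1 filter plus 6 from the 2x2 filter
```

In a full model the concatenated features would feed a learned scoring layer; that part is omitted here because the abstract gives no details about it.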
Abstract: Knowledge tracing is the key component in online individualized learning, which is capable of assessing the users' mastery of skills and predicting the probability that the users can solve specific problems. Available knowledge tracing models have the problem that the assessments are not directly used in the predictions. To make full use of the assessments during predictions, a novel model, named deep knowledge tracing embedding neural network (DKTENN), is proposed in this work. DKTENN is a synthesis of deep knowledge tracing (DKT) and knowledge graph embedding (KGE). DKT utilizes long short-term memory (LSTM) to assess the users and track their mastery of skills according to the users' interaction sequences with skill-level tags, and KGE is applied to predict the probability on the basis of both the embedded problems and DKT's assessments. In the experiments, DKTENN outperforms performance factors analysis and the other knowledge tracing models based on deep learning.
Funding: National Key Research and Development Program (2020AAA0108500); National Natural Science Foundation of China Project (No. U1836118); Key Laboratory of Rich Media Digital Publishing, Content Organization and Knowledge Service (No. ZD2022-10/05).
Abstract: Machine reading comprehension (MRC) has been a research focus in natural language processing and intelligence engineering. However, there is a lack of models and datasets for MRC tasks in the anti-terrorism domain. Moreover, current research lacks the ability to embed accurate background knowledge and provide precise answers. To address these two problems, this paper first builds a text corpus and testbed that focuses on the anti-terrorism domain in a semi-automatic manner. Then, it proposes a knowledge-based machine reading comprehension model that fuses domain-related triples from a large-scale encyclopedic knowledge base to enhance the semantics of the text. To eliminate knowledge noise that could lead to semantic deviation, this paper uses a mixed mutual attention mechanism among questions, passages, and knowledge triples to select the most relevant triples before embedding their semantics into the sentences. Experimental results indicate that the proposed approach can achieve a 70.70% EM value and an 87.91% F1 score, improvements of 4.23% and 3.35% over existing methods, respectively.
Abstract: When training a large-scale knowledge graph embedding (KGE) model with multiple graphics processing units (GPUs), a partition-based method is necessary for parallel training. However, existing partition-based training methods suffer from low GPU utilization and high input/output (IO) overhead between memory and disk. To address the high IO overhead, we optimized twice partitioning with fine-grained GPU scheduling to reduce the IO between CPU memory and disk. To address the low GPU utilization caused by GPU load imbalance, we proposed balanced partitioning and dynamic scheduling methods to accelerate training in different cases. With the above methods, we proposed fine-grained partitioning KGE, an efficient KGE training framework for multiple GPUs. We conducted experiments on several knowledge graph benchmarks, and the results show that our method achieves a speedup over existing frameworks in KGE training.
Funding: Supported by the Key Research & Development Projects in Hubei Province (2022BAA041 and 2021BCA124) and the Open Foundation of Engineering Research Center of Cyberspace (KJAQ202112002).
Abstract: News recommendation systems are designed to deal with massive amounts of news and provide personalized recommendations for users. Accurately capturing user preferences and modeling news and users is the key to news recommendation. In this paper, we propose a new framework, a news recommendation system based on topic embedding and knowledge embedding (NRTK). NRTK handles news titles that users have clicked on from two perspectives to obtain news and user representation embeddings: 1) extracting explicit and latent topic features from news and mining users' preferences for them in historical behaviors; 2) extracting entities and propagating users' potential preferences in the knowledge graph. Experiments on a real-world dataset validate the effectiveness and efficiency of our approach.
Funding: This work was supported by the National Basic Research 973 Program of China under Grant No. 2014CB340405, the National Key Research and Development Program of China under Grant No. 2016YFB1000902, and the National Natural Science Foundation of China under Grant Nos. 61402442, 61272177, 61173008, 61232010, 61303244, 61572469, 91646120 and 61572473.
Abstract: Knowledge graph embedding, which maps entities and relations into low-dimensional vector spaces, has demonstrated its effectiveness in many tasks such as link prediction and relation extraction. Typical methods include TransE, TransH, and TransR. All these methods map different relations into the vector space separately, and the intrinsic correlations among these relations are ignored. Some correlations clearly exist among relations, because different relations may connect to a common entity. For example, the triples (Steve Jobs, PlaceOfBirth, California) and (Apple Inc., Location, California) share the same entity California as their tail entity. We analyze the embedded relation matrices learned by TransE/TransH/TransR, and find that the correlations of relations do exist and that they appear as a low-rank structure in the embedded relation matrix. It is natural to ask whether we can leverage these correlations to learn better embeddings for the entities and relations in a knowledge graph. In this paper, we propose to learn the embedded relation matrix by decomposing it as a product of two low-dimensional matrices, in order to characterize the low-rank structure. The proposed method, called TransCoRe (Translation-Based Method via Modeling the Correlations of Relations), learns the embeddings of entities and relations within a translation-based framework. Experimental results on the benchmark datasets of WordNet and Freebase demonstrate that our method outperforms the typical baselines on link prediction and triple classification tasks.
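As a rough illustration of the low-rank idea in this abstract, the sketch below builds a relation embedding matrix as the product of two smaller matrices and scores a triple with a TransE-style distance. The factor names `A` and `B`, the sizes, the random values, and the L1 norm are assumptions made for the example; the abstract does not give the exact TransCoRe formulation or training procedure.

```python
import random

def matmul(a, b):
    """Multiply an (n x r) matrix by an (r x d) matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transe_score(head, relation, tail):
    """TransE-style L1 distance ||h + r - t||_1; lower means more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))

random.seed(0)
num_relations, rank, dim = 4, 2, 5   # hypothetical sizes with rank < min(4, 5)

# Assumed factors: B holds `rank` shared basis vectors, A mixes them per relation.
A = [[random.uniform(-1, 1) for _ in range(rank)] for _ in range(num_relations)]
B = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(rank)]

# Each row of R combines the same basis vectors, so R has rank at most `rank`
# and correlated relations share structure through B.
R = matmul(A, B)

head = [random.uniform(-1, 1) for _ in range(dim)]
# Construct a tail that satisfies h + r = t for relation 0.
tail = [h + r for h, r in zip(head, R[0])]

print(transe_score(head, R[0], tail))  # distance under the matching relation
print(transe_score(head, R[1], tail))  # distance under a different relation
```

With this factorization the model stores `num_relations * rank + rank * dim` relation parameters instead of `num_relations * dim`, which is where the parameter sharing across relations comes from.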
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grant No. 22120220069 and the National Natural Science Foundation of China under Grant No. 62176185, and supported in part by the Shanghai Artificial Intelligence Innovation and Development Fund under Grant 2020RGZN-02026.
Abstract: COVID-19 evolves rapidly, and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinical knowledge, vaccines, prevention measures, and COVID-19 mutations. Question answering (QA) has become the mainstream way for users to consume the ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system that offers consulting services around the clock to relieve the stress on health services. In particular, people increasingly pay more attention to complex multi-hop questions rather than simple ones during the lasting pandemic, but the existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations in large-scale COVID-19 knowledge graphs to return answers to a given question. In the field of question answering over knowledge graphs, current methods usually represent entities and schemas based on knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different kinds of knowledge (i.e., entities and questions) based on specified embeddings, an issue arises in that these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings, which serve as important prior knowledge, to help search for the correct answer entity of specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for ease of reference) for COKG-QA based on the linked knowledge graph OpenKG-COVID-19 launched by OpenKG, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance on the 1-hop and 2-hop data while obtaining the best result with significant improvements on the 3-hop data. It is also more efficient for use in a user-facing QA system. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.
Funding: Partially supported by the National Natural Science Foundation of China (Nos. 61302077, 61520106007, 61421061, and 61602048).
Abstract: Knowledge graph representation has been a long-standing goal of artificial intelligence. In this paper, we consider a method for knowledge graph embedding of hyper-relational data, which are commonly found in knowledge graphs. Previous models such as Trans(E, H, R) and CTransR are either insufficient for embedding hyper-relational data or focus on projecting an entity into multiple embeddings, which might neither generalize effectively nor accurately reflect real knowledge. To overcome these issues, we propose the novel model TransHR, which transforms the hyper-relations in a pair of entities into an individual vector, serving as a translation between them. We experimentally evaluate our model on two typical tasks, link prediction and triple classification. The results demonstrate that TransHR significantly outperforms Trans(E, H, R) and CTransR, especially for hyper-relational data.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2015CB352302) and the National Natural Science Foundation of China (Nos. U1509206 and 61472353).
Abstract: As a joint-optimization problem that simultaneously fulfills two different but correlated embedding tasks (i.e., entity embedding and relation embedding), the knowledge embedding problem is solved in a joint embedding scheme. In this embedding scheme, we design a joint compatibility scoring function to quantitatively evaluate the relational facts with respect to entities and relations, and further incorporate the scoring function into the max-margin structure learning process, which explicitly learns the embedding vectors of entities and relations using the context information of the knowledge base. By optimizing the joint problem, our design is capable of effectively capturing the intrinsic topological structures in the learned embedding spaces. Experimental results demonstrate the effectiveness of our embedding scheme in characterizing the semantic correlations among different relation units, and in relation prediction for knowledge inference.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2020YFB2104100, the National Natural Science Foundation of China under Grant Nos. 61972403 and U1711261, the Fundamental Research Funds for the Central Universities of China, the Research Funds of Renmin University of China, and the Tencent Rhino-Bird Joint Research Program.
Abstract: Identifying semantic types for attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in databases. However, due to a lack of unified naming standards across prevalent information systems (a.k.a. information islands), AST identification remains an open problem. To tackle this problem, we propose a context-aware method to determine the ASTs for relations. We transform AST identification into a multi-class classification problem and propose a schema context aware (SCA) model to learn the representation from a collection of relations associated with attribute values and schema context. Based on the learned representation, we predict the AST for a given attribute from an underlying relation, wherein the predicted AST is mapped to one of the labeled ASTs. To improve the performance of AST identification, especially when the predicted semantic types of attributes are not included in the labeled ASTs, we then introduce knowledge base embeddings (a.k.a. KBVec) to enhance the above representation and construct a knowledge-base-enhanced schema context aware model (SCA-KB) to obtain a stable and robust model. Extensive experiments on real datasets demonstrate that our context-aware method outperforms the state-of-the-art approaches by a large margin, up to 6.14% and 25.17% in terms of macro average F1 score, and up to 0.28% and 9.56% in terms of weighted F1 score, over high-quality and low-quality datasets respectively.