Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61703293, 61751206, and 61672368.
Abstract: Entity relation classification aims to classify the semantic relationship between two marked entities in a given sentence, and plays a vital role in various natural language processing applications. However, existing studies focus on exploiting monolingual data in English, due to the lack of labeled data in other languages. How to effectively benefit from a richly-labeled language to help a poorly-labeled language is still an open problem. In this paper, we propose a language adaptation framework for cross-lingual entity relation classification. The basic idea is to employ adversarial neural networks (AdvNN) to transfer feature representations from one language to another. In particular, this language adaptation framework enables feature imitation via the competition between a sentence encoder and a rival language discriminator, yielding effective representations. To verify the effectiveness of AdvNN, we introduce two adversarial structures: dual-channel AdvNN and single-channel AdvNN. Experimental results on the ACE 2005 multilingual training corpus show that our single-channel AdvNN achieves the best performance in both unsupervised and semi-supervised scenarios, yielding improvements of 6.61% and 2.98% over the state of the art, respectively. Compared with baselines that directly adopt a machine translation module, we find that both dual-channel and single-channel AdvNN significantly improve the performance (F1) of cross-lingual entity relation classification. Moreover, extensive analysis and discussion demonstrate the appropriateness and effectiveness of different parameter settings in our language adaptation framework.
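The adversarial setup described above can be sketched in a few lines: a sentence encoder produces features, a language discriminator tries to tell the two languages apart, and the encoder is trained on the discriminator's gradient with its sign reversed, pushing the two languages' feature distributions together. The sketch below is illustrative only; the toy shapes, linear encoder, and logistic discriminator are our assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Sentence encoder: one linear layer + tanh (stand-in for a real encoder)."""
    return np.tanh(x @ W)

def disc_loss_and_grad(h, w, y):
    """Logistic language discriminator: loss and its gradient w.r.t. features h."""
    p = 1.0 / (1.0 + np.exp(-(h @ w)))           # P(language = 1)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_h = ((p - y)[:, None] * w[None, :]) / len(y)
    return loss, grad_h

x = rng.normal(size=(8, 16))                     # 8 sentences, 16-dim input
W = rng.normal(size=(16, 4)) * 0.1               # encoder weights
w = rng.normal(size=4) * 0.1                     # discriminator weights
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])           # source vs. target language

h = encode(x, W)
loss, grad_h = disc_loss_and_grad(h, w, y)

# Gradient reversal: the encoder ascends the discriminator loss, so its
# features become harder to classify by language (feature imitation).
lambda_adv = 1.0
encoder_grad_h = -lambda_adv * grad_h
```

The discriminator itself is updated normally on `grad_h`; only the encoder sees the reversed gradient.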
Funding: Hainan Province High-Level Talent Project of the Basic and Applied Basic Research Plan (Natural Science) in 2019 (No. 2019RC100); Haikou City Key Science and Technology Plan Project (2020-049); Hainan Province Key Research and Development Project (ZDYF2020018).
Abstract: Entity relation extraction (ERE) is an important task in the field of information extraction. With the wide application of pre-trained language models (PLMs) in natural language processing (NLP), using PLMs has become a brand-new research direction for ERE. In this paper, BERT is used to extract entity relations, and a separated pipeline architecture is proposed: ERE is decomposed into an entity-relation classification sub-task and an entity-pair annotation sub-task, and both sub-tasks conduct pre-training and fine-tuning independently. Combining dynamic and static masking, new Verb-MLM and Entity-MLM BERT pre-training tasks are put forward to strengthen the correlation between BERT pre-training and the targeted NLP downstream task, ERE. An inter-layer attention-sharing mechanism is added to the model, sharing attention parameters according to the similarity of the attention matrices. Contrast experiments on the SemEval-2010 Task 8 dataset demonstrate that the new MLM tasks and the inter-layer attention-sharing mechanism effectively improve the performance of BERT on entity relation extraction.
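The targeted masking idea behind Verb-MLM and Entity-MLM amounts to choosing which positions to mask (verbs or entity tokens) rather than sampling them uniformly as standard MLM does. A minimal sketch; the function name, the 0.8 masking probability, and the example sentence are our own illustrative choices, not the paper's code:

```python
import random

MASK = "[MASK]"

def targeted_mlm_mask(tokens, target_positions, mask_prob=0.8, seed=0):
    """Mask only the given positions (e.g. verbs or entity tokens),
    returning the corrupted inputs and per-position prediction labels."""
    rng = random.Random(seed)
    inputs, labels = list(tokens), [None] * len(tokens)
    for i in target_positions:
        labels[i] = tokens[i]            # the model must predict the original token
        if rng.random() < mask_prob:
            inputs[i] = MASK             # usually replace with [MASK]
    return inputs, labels

tokens = ["the", "company", "acquired", "the", "startup"]
# Verb-MLM-style corruption: mask the verb at position 2.
inputs, labels = targeted_mlm_mask(tokens, target_positions=[2], mask_prob=1.0)
# inputs → ["the", "company", "[MASK]", "the", "startup"]
```

With `mask_prob < 1.0`, some targeted tokens are left intact but still predicted, mirroring BERT's keep-or-mask behavior.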
Funding: Supported by the National Natural Science Foundation of China (62062062), hosted by Gulila Altenbek.
Abstract: Due to the structural dependencies among concurrent events in a knowledge graph and the substantial amount of sequential correlation information carried by temporally adjacent events, we propose an Independent Recurrent Temporal Graph Convolution Networks (IndRT-GCNets) framework to efficiently and accurately capture event attribute information. The framework models knowledge graph sequences to learn the evolutionary representations of entities and relations within each period. First, by utilizing the temporal graph convolution module in the evolutionary representation unit, the framework captures the structural dependency relationships within the knowledge graph in each period. Meanwhile, to achieve better event representations and establish effective correlations, an independent recurrent neural network is employed to implement auto-regressive modeling. Furthermore, static attributes of entities in the entity-relation events are constrained and merged using a static graph constraint to obtain optimal entity representations. Finally, the evolution of entity and relation representations is utilized to predict events in the next step. On multiple real-world datasets, including Freebase13 (FB13), Freebase15K (FB15K), WordNet11 (WN11), WordNet18 (WN18), FB15K-237, WN18RR, YAGO3-10, and NELL-995, multiple evaluation metrics show that the proposed IndRT-GCNets framework outperforms most existing models on knowledge reasoning tasks, which validates its effectiveness and robustness.
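The two building blocks the abstract names, graph convolution over each period's snapshot and an independent recurrent (IndRNN) update across periods, can be sketched as follows. All shapes, names, and the toy rollout are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def graph_conv(A, H, W):
    """One graph-convolution layer: a row-normalized adjacency (with
    self-loops) aggregates each node's neighbourhood before a linear map."""
    A_hat = A + np.eye(A.shape[0])
    D_inv = 1.0 / A_hat.sum(axis=1, keepdims=True)
    return np.maximum(0.0, (D_inv * A_hat) @ H @ W)      # ReLU

def indrnn_step(x_t, h_prev, Wx, u, b):
    """One IndRNN step: unlike a vanilla RNN, the recurrent weight u is a
    vector, so each hidden unit recurs only with itself (element-wise)."""
    return np.maximum(0.0, x_t @ Wx + u * h_prev + b)

# Toy rollout: per period, aggregate over the graph snapshot, then update
# the recurrent state auto-regressively.
rng = np.random.default_rng(1)
A = np.array([[0, 1], [1, 0]], dtype=float)              # 2 entities, one edge
H = rng.normal(size=(2, 3))                              # entity features
W = rng.normal(size=(3, 3)) * 0.5
Wx = rng.normal(size=(3, 3)) * 0.5
u, b = rng.normal(size=3) * 0.5, np.zeros(3)

h = np.zeros((2, 3))
for _ in range(4):                                       # four periods
    h = indrnn_step(graph_conv(A, H, W), h, Wx, u, b)
```

The element-wise recurrence is what lets IndRNN stack deeply and carry long histories without the gradient coupling of a full recurrent matrix.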
Abstract: Inferring the semantic types of entity mentions in a sentence is a necessary yet challenging task. Most existing methods employ a very coarse-grained type taxonomy, which is too general and not exact enough for many tasks. However, the performance of these methods drops sharply when the type taxonomy is extended to a fine-grained one with several hundred types. In this paper, we introduce a hybrid neural network model for type classification of entity mentions with a fine-grained taxonomy. There are four components in our model, namely the entity mention component, the context component, the relation component, and the already-known-type component, which extract features from the target entity mention, its context, its relations, and the already known types of entity mentions in the surrounding context, respectively. The features learned by the four components are concatenated and fed into a softmax layer to predict the type distribution. We carried out extensive experiments to evaluate the proposed model. Experimental results demonstrate that our model achieves state-of-the-art performance on the FIGER dataset. Moreover, we extracted larger datasets from Wikipedia and DBpedia. On the larger datasets, our model achieves performance comparable to the state-of-the-art methods with the coarse-grained type taxonomy, but performs much better than those methods with the fine-grained type taxonomy in terms of micro-F1, macro-F1, and weighted-F1.
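The final classification step, concatenating the four components' feature vectors and feeding them to a softmax layer, can be sketched as below; the dimensions, random features, and five-type output are hypothetical stand-ins for the model's real components:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
# Hypothetical 8-dim feature vectors produced by the four components.
mention_f  = rng.normal(size=8)     # entity mention component
context_f  = rng.normal(size=8)     # context component
relation_f = rng.normal(size=8)     # relation component
known_f    = rng.normal(size=8)     # already-known-type component

features = np.concatenate([mention_f, context_f, relation_f, known_f])  # 32-dim
W_out = rng.normal(size=(32, 5)) * 0.1          # 5 candidate types (toy taxonomy)
probs = softmax(features @ W_out)               # predicted type distribution
```

In the fine-grained setting the output layer would simply have several hundred columns instead of five.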
Funding: Supported by the National Natural Science Foundation of China (71804017), the R&D Program of Beijing Municipal Education Commission (KZ202210005013), and the Sichuan Social Science Planning Project (SC22B151).
Abstract: To address the lack of a well-classified, standard corpus for the task of joint entity and relation extraction in the Chinese academic domain, this paper builds a management-science dataset that can be used for joint entity and relation extraction, and establishes a deep learning model to extract entity and relation information from scientific texts. With the definition of entity and relation classification, we build a Chinese scientific text corpus based on the abstracts of projects funded by the National Natural Science Foundation of China (NSFC) in 2018–2019. By combining word2vec features with a clue-word feature, a kind of stylistic cue particular to scientific documents, we establish a joint entity-relation extraction model based on the BiLSTM-CNN-CRF model for scientific information extraction. The constructed dataset contains 13,060 distinct entities and 9,728 entity-relation labels. For entity prediction, the model reaches a precision of 69.15%, a recall of 61.03%, and an F1 of 64.83%. For relation prediction, the precision is higher than that of entity prediction, which reflects the effectiveness of the mixed input features and of integrating local features through the CNN layer in the model.
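The CRF layer on top of the BiLSTM-CNN encoder decodes the best tag sequence with the Viterbi algorithm. A minimal sketch over precomputed emission and transition scores (a generic CRF decoder, not the authors' code; the two-tag example is ours):

```python
import numpy as np

def viterbi(emissions, transitions):
    """CRF decoding: emissions is a [T, K] array of per-token tag scores,
    transitions a [K, K] array of tag-to-tag scores; returns the
    highest-scoring tag sequence via dynamic programming."""
    T, K = emissions.shape
    score = emissions[0].copy()                 # best score ending in each tag
    back = np.zeros((T, K), dtype=int)          # backpointers
    for t in range(1, T):
        # cand[i, j]: best score of a path ending in tag i then moving to j
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]                # trace back the best path
    for t in range(T - 1, 0, -1):
        best.append(int(back[t][best[-1]]))
    return best[::-1]

# Toy example: 3 tokens, 2 tags, no transition preference.
path = viterbi(np.array([[1., 0.], [0., 1.], [1., 0.]]), np.zeros((2, 2)))
# path → [0, 1, 0]
```

In the full model, the emission scores come from the BiLSTM-CNN encoder and the transition matrix is learned jointly with it.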
Funding: This work was supported by the National Key Research and Development Plan of China (2017YFD0400101), the National Natural Science Foundation of China (Grant No. 61502294), and the Natural Science Foundation of Shanghai (16ZR1411200).
Abstract: Knowledge bases (KBs) are far from complete, necessitating KB completion. Among various methods, embedding has received increasing attention in recent years. PTransE, an important embedding-based approach to KB completion, considers multiple-step relation paths based on TransE, but ignores the association between an entity and the related entities that share the same direct relationships with it. In this paper, we propose an approach called EP-TransE, which takes this kind of association into account. The intuition is that the dissimilarity of these related entities should be bounded and should not exceed a certain threshold. EP-TransE adjusts the embedding vector of an entity by comparing it with the related entities connected by the same direct relationship, and further constrains the Euclidean distance between them to be less than the threshold. As a result, the embedding vectors of entities contain rich semantic information, which is valuable for KB completion. In experiments, we evaluated our approach on two tasks, entity prediction and relation prediction. Experimental results show that considering the dissimilarity of related entities sharing the same direct relationships is effective.
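The two ingredients, the standard TransE translation score and the EP-TransE-style bound on the distance between entities sharing a direct relation, can be sketched as follows. The hinge-on-the-excess penalty form is our illustrative reading of the threshold constraint, not the paper's exact objective:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE energy ||h + r - t||_1: lower means the triple is more plausible."""
    return np.linalg.norm(h + r - t, ord=1)

def related_entity_penalty(e, related, threshold):
    """Penalize entities that share a direct relation with e but drift
    beyond `threshold` Euclidean distance from it (illustrative hinge form)."""
    return sum(max(0.0, float(np.linalg.norm(e - e2)) - threshold)
               for e2 in related)

# A perfectly translated triple scores 0.
score = transe_score(np.array([0., 0.]), np.array([1., 0.]), np.array([1., 0.]))
# score → 0.0

# A related entity at distance 5 with threshold 2 incurs an excess of 3.
pen = related_entity_penalty(np.array([0., 0.]), [np.array([3., 4.])], 2.0)
# pen → 3.0
```

During training, such a penalty would be added to the usual margin-based TransE loss so that same-relation neighbors stay within the threshold.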