Relation Extraction (RE) aims to identify the predefined relation type holding between two entities mentioned in a piece of text, e.g., a sentence or a document. Most existing studies suffer from noise in the text, so pruning is of great importance. Conventional sentence-level RE addresses this issue with a denoising method that uses the shortest dependency path to build a long-range semantic dependency between the entities of a pair. However, this kind of denoising method is scarce in document-level RE. In this work, we explicitly model a denoised document-level graph based on linguistic knowledge to capture various long-range semantic dependencies among entities. We first formalize a Syntactic Dependency Tree forest (SDT-forest) by introducing syntactic and discourse dependency relations. Then, a Steiner tree algorithm extracts a mention-level denoised graph, the Steiner Graph (SG), removing linguistically irrelevant words from the SDT-forest. We then devise a slide residual attention to highlight word-level evidence in the text and the SG. Finally, classification is performed on the SG to infer the relations of entity pairs. We conduct extensive experiments on three public datasets. The results show that our method helps establish long-range semantic dependencies and can improve classification performance on longer texts.
Funding: Supported by the National Natural Science Foundation of China (Nos. U19A2059 and 62176046).
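The pruning step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which Steiner tree algorithm is used, so the code below swaps in a simple shortest-path heuristic (grow the tree by repeatedly attaching the nearest remaining mention). The function names, the hand-written dependency edges, and the single link between the two sentence roots (standing in for a discourse relation) are all assumptions made for illustration.

```python
from collections import deque


def nearest_terminal_path(adj, tree_nodes, targets):
    """Multi-source BFS from the current tree; return the path to the closest target."""
    parent = {n: None for n in tree_nodes}
    queue = deque(sorted(tree_nodes))
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]  # starts at a tree node, ends at the target
        for nxt in sorted(adj.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def approx_steiner_tree(edges, terminals):
    """Approximate Steiner tree over an unweighted, undirected graph.

    Heuristic (an assumption, not the paper's method): start from one terminal
    and repeatedly attach the nearest remaining terminal via a shortest path.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    while remaining:
        path = nearest_terminal_path(adj, tree_nodes, remaining)
        if path is None:
            raise ValueError("terminals are not connected")
        for u, v in zip(path, path[1:]):
            tree_edges.add(frozenset((u, v)))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_nodes, tree_edges


# Toy two-sentence document with hand-written (hypothetical) dependency edges:
#   s1: "Marie Curie was born in Warsaw."
#   s2: "She later moved to Paris with her sister."
edges = [
    ("s1:Marie", "s1:Curie"), ("s1:Curie", "s1:born"), ("s1:was", "s1:born"),
    ("s1:Warsaw", "s1:born"), ("s1:in", "s1:Warsaw"),
    ("s2:She", "s2:moved"), ("s2:later", "s2:moved"), ("s2:Paris", "s2:moved"),
    ("s2:to", "s2:Paris"), ("s2:sister", "s2:moved"), ("s2:with", "s2:sister"),
    ("s2:her", "s2:sister"),
    ("s1:born", "s2:moved"),  # assumed discourse link joining the two sentence roots
]
mentions = ["s1:Curie", "s1:Warsaw", "s2:Paris"]

nodes, tree_edges = approx_steiner_tree(edges, mentions)
print(sorted(nodes))
```

In this toy forest the pruned graph keeps only the three mentions and the two verbs that connect them, while words such as "sister" or "later" are dropped. With exactly two mentions, the heuristic reduces to the shortest dependency path used in sentence-level RE.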