Abstract
End-to-end entity relation extraction can be decomposed into two subtasks, named entity recognition and relation extraction, and most recent work models them jointly. Existing pipeline approaches have confirmed both the importance of fusing entity type information into the relation model and the potential of pipeline models, but they overlook the fact that an entity in the text may have several types at once, a form of ambiguity that is particularly common in Chinese datasets. To address this problem, this paper proposes an entity cascading type mechanism and, building on it, develops a pipeline model better suited to Chinese relation extraction, named CENTRELINE. The entity module of the pipeline is a word-word relation classification model: it uses BERT and a bidirectional LSTM as encoders, applies dilated convolution after conditional layer normalization, and finally outputs entities together with their cascading types through a cascading type predictor. The input of the relation module is constructed solely from the output of the entity module. On the DuIE1.0, DuIE2.0, and CMeIE-V2 datasets, the F1 scores of the method exceed the baseline by 7.23, 6.93, and 8.51 percentage points, respectively, and the method achieves state-of-the-art performance on both DuIE1.0 and DuIE2.0. Ablation experiments show that both the proposed cascading type mechanism and the pipeline model adapted to the characteristics of the Chinese language clearly improve relation extraction performance.
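As a rough illustration of the dilated convolution mentioned in the abstract, the sketch below implements a one-dimensional dilated convolution in plain Python. It is not the CENTRELINE implementation: the abstract gives no kernel sizes, dilation rates, or tensor shapes, so the function names `dilated_conv1d` and `receptive_field`, the zero padding, and all numeric values are assumptions chosen only to show how dilation widens the context each output position sees without adding weights.

```python
from typing import List


def dilated_conv1d(signal: List[float], kernel: List[float], dilation: int = 1) -> List[float]:
    """1-D dilated convolution (cross-correlation form) with zero 'same' padding.

    Illustrative only; kernel and dilation values are not taken from the paper.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel length")
    half = len(kernel) // 2
    output = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` positions apart around position i.
            j = i + (k - half) * dilation
            if 0 <= j < len(signal):  # positions outside the sequence count as zero
                total += weight * signal[j]
        output.append(total)
    return output


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    tokens = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_conv1d(tokens, [1.0, 1.0, 1.0], dilation=1))
    print(dilated_conv1d(tokens, [1.0, 1.0, 1.0], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
```

With a kernel of size 3, stacking hypothetical dilation rates 1, 2, and 4 covers 15 positions, whereas three undilated layers would cover only 7. This is the usual motivation for using dilation to capture longer-range interactions between words.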
Authors
Rao Dongning (饶东宁)
Wu Qianmei (吴倩梅)
Huang Guanju (黄观琚)
School of Computers, Guangdong University of Technology, Guangzhou 510006, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core (北大核心)
2024, No. 9, pp. 2685-2689 (5 pages)
Funding
General Program of the Natural Science Foundation of Guangdong Province (2021A1515012556).
Keywords
Chinese relation extraction
pipeline model
dilated convolution
entity cascading type