Abstract
Rescue measures for water projects are an important part of flood-control emergency plans. This paper uses information extraction to extract water project rescue knowledge from various unstructured text sources and convert it into 〈entity, relationship, entity〉 triples, providing structured knowledge support for the intelligent generation of emergency plans. The heterogeneous tasks of water project rescue entity extraction and relationship extraction are treated as sequence-to-sequence generation tasks, and a joint extraction framework based on the large pre-trained language model T5 is proposed: Water Project Rescue Entities and Relationships Joint Extraction (WRERJE). WRERJE is a multi-task framework that performs entity extraction and relationship extraction simultaneously, using dynamic prompts to guide T5 in the joint extraction of entities and relationships. A text data augmentation method specific to the water project rescue domain is also studied: after WRERJE is initially fine-tuned on a small number of labeled samples, the augmentation method yields additional data that are vaguely described but correctly labeled, and these data are used to fine-tune WRERJE further, improving its performance on extracting water project rescue entities and relationships. Experimental evaluation shows that WRERJE achieves high performance on the joint extraction task (F1 values of 78.42% for entity extraction and 78.22% for relationship extraction), which verifies the effectiveness of the dynamic prompts and the joint extraction approach.
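As a rough illustration of what treating extraction as sequence-to-sequence generation involves, the sketch below builds a schema-dependent prompt, parses a linearized model output back into triples, and scores the result with F1. It is a minimal sketch under stated assumptions, not the WRERJE implementation: the prompt layout, the "(head; relation; tail)" output format, the function names, and the sample strings are all hypothetical, and the T5 generation step itself is omitted.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def build_prompt(text: str, entity_types: List[str], relation_types: List[str]) -> str:
    """Prepend the requested schema to the input text.

    Assumption: the prompt is "dynamic" in the sense that it changes with
    the entity and relation types requested; the layout is hypothetical.
    """
    return (
        "entities: " + ", ".join(entity_types)
        + " | relations: " + ", ".join(relation_types)
        + " | text: " + text
    )


def parse_triples(generated: str) -> Set[Triple]:
    """Parse a linearized output like '(h; r; t) (h; r; t)' into triples.

    Assumption: the model emits one parenthesized, semicolon-separated
    group per triple; malformed groups are skipped.
    """
    triples: Set[Triple] = set()
    for chunk in generated.split(")"):
        chunk = chunk.strip().lstrip("(")
        parts = [p.strip() for p in chunk.split(";")]
        if len(parts) == 3 and all(parts):
            triples.add((parts[0], parts[1], parts[2]))
    return triples


def f1_score(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """Exact-match F1 between predicted and gold triple sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration only; not data from the paper.
prompt = build_prompt(
    "Piping at the dike toe is treated with a filter well.",
    entity_types=["hazard", "measure"],
    relation_types=["treated by"],
)
predicted = parse_triples("(piping; treated by; filter well) (breach; treated by; plugging)")
gold = {("piping", "treated by", "filter well"), ("breach", "treated by", "sandbags")}
score = f1_score(predicted, gold)
```

In a full pipeline, the string returned by build_prompt would be tokenized and passed to the fine-tuned T5 model, and the decoded output would be handed to parse_triples before scoring.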
Authors
杨阳蕊
朱亚萍
刘雪梅
陈思思
李慧敏
YANG Yangrui; ZHU Yaping; LIU Xuemei; CHEN Sisi; LI Huimin (School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450000, China; Collaborative Innovation Center for Efficient Utilization of Water Resources, Zhengzhou 450000, China; School of Water Conservancy, North China University of Water Resources and Electric Power, Zhengzhou 450000, China)
Source
《水利学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 7, pp. 818-828 (11 pages)
Journal of Hydraulic Engineering
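Funding
National Natural Science Foundation of China (72271091)
Science and Technology Open Cooperation Project of the Henan Academy of Sciences (220901008)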
Keywords
water project rescue
emergency plan
information extraction
dynamic prompt
joint extraction
text data augmentation