Abstract
Document-level service event sequence extraction aims to discover the sequential order of all service events in the text describing a given service and to construct a set of service events arranged in order of occurrence. The task is widely applicable to downstream tasks such as knowledge graph construction and automatic question answering. Existing work related to this task falls into two categories: process extraction and event temporal relation extraction. Research on process extraction assumes that the order in which events actually occur matches the order in which the text describes them, ignoring the many non-process texts in which the two orders differ. Research on event temporal relation extraction usually focuses on judging the temporal relation between pairs of events and cannot model the sequential order of all events. To address these problems, a document-level service event sequence extraction method based on multi-granularity information coding and joint optimization is proposed. A multi-granularity information coding module learns semantically rich vector representations of the service events in the service text, and a joint optimization module then extracts the sequential relations among the service events to obtain the document-level service event sequence. Because no public dataset can be used directly to evaluate service event sequence extraction, a dataset for evaluating this task is constructed from data drawn from the public event temporal relation extraction datasets TimeBank (TB), AQUAINT (AQ), Platinum (PL), and MATRES. Experimental results demonstrate the effectiveness of the proposed method.
Authors
程钦男
莫志强
曹斌
范菁
单宇翔
Cheng Qinnan; Mo Zhiqiang; Cao Bin; Fan Jing; Shan Yuxiang (College of Computer Science & Technology, Zhejiang University of Technology, Hangzhou, 310023, China; Information Center, China Tobacco Zhejiang Industrial Co., Ltd., Hangzhou, 310009, China)
Source
《南京大学学报(自然科学版)》
CAS
CSCD
PKU Core
2023, Issue 3, pp. 460-470 (11 pages)
Journal of Nanjing University (Natural Science)
Funding
National Natural Science Foundation of China (62276233)
Science and Technology Program of Zhejiang Province (2023C01048)
Keywords
service text
service event
sequence extraction
multi-granularity coding
joint optimization