Abstract
Accurately tracing the structure of ancient Chinese classics and extracting the events and event arguments they contain are important for transforming ancient books from plain text into machine-intelligible data. Research on event extraction from ancient Chinese texts has relied mainly on three approaches: pattern matching, machine learning, and neural networks. This paper incorporates a machine reading comprehension formulation into existing neural network-based methods: the "event type" and "argument role" used in event extraction are combined into a question, and the answer returned by the model is the event argument. Zuo Zhuan, a chronicle-style history, is used as the training data, and Shiji (Records of the Grand Historian), a biographical-style history, is used as the generalization data. Distractor ("confused") sentences are introduced during generalization to verify the model's effectiveness. The approach offers a reference for event extraction from ancient Chinese texts.
Authors
Yu Xuehan (喻雪寒)
He Lin (何琳)
Wang Xianqi (王献琪)
Affiliations: College of Information Management, Nanjing Agricultural University, Nanjing 210095; Research Center for Humanities and Social Computing, Nanjing Agricultural University, Nanjing 210095
Source
Journal of the China Society for Scientific and Technical Information (《情报学报》)
Indexed in: CSCD; PKU Core Journals (北大核心)
2023, Issue 3, pp. 316-326 (11 pages)
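Funding
General Program of the National Social Science Fund of China, "Research on Automatic Construction of a Knowledge Representation System for Traditional Chinese Culture Based on Classical Texts" (基于典籍的中华传统文化知识表达体系自动构建方法研究) (18BTQ063).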
Keywords
ancient books
machine reading comprehension
event extraction
RoBERTa
confused sentences