Chinese Event Extraction Method Based on RoBERTa and Multi-Level Features
Abstract: To address insufficient semantic representation and incomplete feature extraction in Chinese event extraction, a method based on RoBERTa and multi-level features is proposed. First, character embeddings are built with the pre-trained RoBERTa model and extended by fusing in part-of-speech tags and trigger-word semantic information. Second, a bidirectional long short-term memory network (BiLSTM) and a convolutional neural network (CNN) extract global and local features, respectively, and a self-attention mechanism captures the associations among the different features so that important features are used more effectively. Finally, a conditional random field (CRF) performs BIO sequence labeling to complete the event extraction. On the DuEE1.0 dataset, the F1 scores for trigger word extraction and event argument extraction reach 86.9% and 68.0%, respectively, outperforming commonly used event extraction models and validating the effectiveness of the method.
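The final step described in the abstract, BIO sequence labeling, produces one tag per character, and the tagged characters still have to be grouped into trigger or argument spans. The following is a minimal sketch of that grouping step only, not the authors' implementation: the function name `decode_bio`, the label `LAYOFF`, the example sentence, and the choice to ignore a stray `I-` tag are all assumptions made for illustration.

```python
from typing import List, Tuple

# (label, start index, end index exclusive); a hypothetical span format
Span = Tuple[str, int, int]


def decode_bio(tags: List[str]) -> List[Span]:
    """Group per-character BIO tags into labeled spans."""
    spans: List[Span] = []
    label = None  # label of the span currently open, if any
    start = 0     # start index of the span currently open
    # The extra "O" is a sentinel that closes a span ending at the last tag.
    for i, tag in enumerate(tags + ["O"]):
        # An I- tag continues the open span only if its label matches.
        if tag.startswith("I-") and label is not None and tag[2:] == label:
            continue
        # Anything else ends the open span.
        if label is not None:
            spans.append((label, start, i))
            label = None
        # A B- tag opens a new span; a stray I- tag is ignored
        # (an assumed convention, other conventions exist).
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


# Hypothetical example: one tag per character of a six-character sentence.
chars = list("公司宣布裁员")
tags = ["O", "O", "O", "O", "B-LAYOFF", "I-LAYOFF"]
spans = decode_bio(tags)
triggers = ["".join(chars[s:e]) for _, s, e in spans]
```

Here `spans` holds a single `LAYOFF` span covering the last two characters, and `triggers` holds the corresponding trigger text. In a full system the tags would come from the CRF decoder rather than being written by hand.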
Authors: Le Yang (乐杨); Hu Junguo (胡军国); Li Yao (李耀) (College of Mathematics and Computer Science, Zhejiang Agriculture & Forestry University, Hangzhou 311300, China)
Source: Application of Electronic Technique (《电子技术应用》), 2023, No. 11, pp. 49-54 (6 pages)
Funding: National Natural Science Foundation of China (31971493).
Keywords: event extraction; RoBERTa pretrained model; multi-level feature; self-attention mechanism; sequence labeling