
Chinese Semantic Role Labeling Based on Dynamic Syntax Pruning

Original title: 基于动态句法剪枝机制的中文语义角色标注. Cited by: 2
Abstract: Semantic role labeling (SRL), a shallow semantic parsing task, has received extensive research attention in recent years and plays a core role in natural language processing (NLP). SRL aims to identify the predicates in a given sentence and their corresponding semantic arguments, which supports downstream NLP tasks such as information extraction, question answering, and reading comprehension. Existing methods fall into two main categories: machine learning methods with hand-crafted discrete features, and deep learning methods with automatically learned distributed features. Early studies largely split SRL into two separate subtasks, predicate disambiguation and argument role labeling. More recently, considerable effort has gone into end-to-end SRL architectures that solve both pipeline steps in one shot with a single unified model. Recent studies also show that integrating external syntactic features, such as syntactic dependency trees, is highly important for SRL, so designing neural models that capture syntactic features effectively has become an active research topic.

He et al. (2018) find that only part of the syntactic structure offers valuable information for SRL, which calls for pruning the syntactic structure features. Constructing syntactic features is a key step of SRL and largely determines final performance, yet existing neural network models fail to model these features effectively. In particular, existing work adopts offline syntactic pruning with fixed, manually defined rules, which inevitably leads either to the loss of key syntactic information or to weakened pruning effectiveness.

To address these issues, we propose an end-to-end neural network model for Chinese SRL based on a dynamic syntactic pruning mechanism. Specifically, we propose two novel methods: a recursive neural network model with dynamic syntactic pruning (Recur-DSP) and a syntax-label graph convolutional network with dynamic syntactic pruning (SGCN-DSP). Recur-DSP uses a recursive neural network to encode and fuse syntactic structure, and discretizes each connection of the syntax tree with the Gumbel-Softmax function to realize dynamic syntactic pruning. SGCN-DSP uses a graph convolutional network to jointly model the arcs of the syntactic dependency tree and their labels, on top of which we introduce a corresponding dynamic syntactic pruning strategy.

Experimental results on multiple benchmark datasets show that the proposed methods outperform the previous best method by a large margin and achieve state-of-the-art performance for Chinese SRL. On the CoNLL09 dataset, SGCN-DSP achieves an F1 score of 86.9% in argument role labeling and 89.1% in predicate identification. Integrating the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) improves performance further: SGCN-DSP reaches an F1 score of 90.4% in argument role labeling and 90.8% in predicate identification.
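As a rough illustration of the idea the abstract describes (discretizing a keep-or-prune choice at each connection of the syntax tree with Gumbel-Softmax), the following minimal sketch shows only the forward sampling step in plain Python. It is not the authors' implementation: the function names, the `(prune_logit, keep_logit)` score layout, the temperature value, and the dependency labels in the usage note are all assumptions made for illustration. A trainable model would compute the logits with a neural network and rely on an autodiff framework to pass gradients through the relaxed sample.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau, rng):
    """Return a relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    perturbed = [(logit + gumbel_noise(rng)) / tau for logit in logits]
    peak = max(perturbed)  # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


def prune_arcs(arcs, arc_logits, tau=0.5, seed=0):
    """Keep or drop each dependency arc based on a Gumbel-Softmax sample.

    arcs       -- list of (head, dependent, label) tuples (hypothetical format)
    arc_logits -- list of (prune_logit, keep_logit) pairs, one per arc;
                  in a real model these would come from a learned scorer
    """
    rng = random.Random(seed)
    kept = []
    for arc, logits in zip(arcs, arc_logits):
        probs = gumbel_softmax(list(logits), tau, rng)
        # Hard decision: keep the arc when the "keep" class wins the sample.
        if probs[1] > probs[0]:
            kept.append(arc)
    return kept
```

For example, calling `prune_arcs` on three arcs with hypothetical labels `SBV`, `HED`, and `VOB`, where the first two have strongly positive keep logits and the third a strongly positive prune logit, retains the first two arcs and drops the third. With logits closer to zero the decision becomes stochastic, and lowering `tau` pushes the relaxed sample closer to a one-hot vector.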
Authors: FEI Hao (费豪); JI Dong-Hong (姬东鸿); REN Ya-Feng (任亚峰) (School of Cyber Science and Engineering, Wuhan University, Wuhan 430072; School of Interpreting and Translation Studies, Guangdong University of Foreign Studies, Guangzhou 510420)
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2022, No. 8, pp. 1746-1764 (19 pages)
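Funding: Supported by the National Key Research and Development Program of China (2017YFC1200500), the National Natural Science Foundation of China (61702121, 61772378), and the Guangzhou Science and Technology Program (202102020607).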
Keywords: natural language processing; semantic role labeling; syntax pruning; neural network; deep learning
