Abstract
Sequence-to-sequence (seq2seq) approaches formalize abstract meaning representation (AMR) parsing as a translation task from a source sentence to a target AMR graph. However, previous studies generally model the source sentence as a word sequence and ignore its inherent syntactic and semantic role information. In this paper, we propose a straightforward yet effective seq2seq-based approach to AMR parsing that incorporates the syntactic and semantic role information of the source sentence. Experimental results show that our approach achieves a significant improvement of 6.7% in F1 score on an English benchmark dataset. Further in-depth analysis from several perspectives reveals how source-side syntactic and semantic role information benefits AMR parsing. The analysis also shows that POS information and subword segmentation contribute most to the improvement, followed by higher-level syntactic and semantic role information.
Authors
GE Donglai (葛东来)
LI Junhui (李军辉)
ZHU Muhua (朱慕华)
LI Shoushan (李寿山)
ZHOU Guodong (周国栋)
Affiliations: School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China; Alibaba Group, Hangzhou, Zhejiang 311121, China
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals (北大核心)
2019, Issue 8, pp. 36-45 (10 pages)
Funding
National Natural Science Foundation of China (61502149, 61876120)