Abstract
We propose a new deep long short-term memory (LSTM) network equipped with elevator units (EU) for semantic role labeling (SRL). The EU conducts a linear combination of adjacent layers, allowing information to flow unimpeded across several layers; with the EU, a very deep stacked LSTM of up to 20 layers can be easily optimized. Importantly, the connection also contains a gate function that regulates and controls the information flow along both the time and space directions, and the appropriate levels of representation are guided directly to the output layer for predicting the corresponding semantic roles. Although the model is quite simple, taking only the original utterances as input without any additional features, it yields strong empirical results: it achieves an F1 score of 81.56% on the CoNLL-2005 shared dataset and 82.53% on the CoNLL-2012 shared dataset, outperforming the previous state-of-the-art results by 0.5% and 1.26%, respectively. Remarkably, it also improves the F1 score on out-of-domain data by 2.2% over the previous state-of-the-art system, the best performance reported to date. The model is simple and easy to implement and parallelize, yielding a parsing speed of 11.8K tokens per second on a single K40 GPU, far faster than previous approaches.
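The abstract describes the elevator unit as a gated linear combination of a layer's input and output, so that information can be carried unchanged past a layer when the gate chooses to do so. The paper's exact equations are not given in this record, so the following is only a minimal highway-style sketch: the gate parameters `W_g` and `b_g`, the sigmoid gate, and the elementwise interpolation are illustrative assumptions, not the authors' precise formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d = 8  # hidden size (illustrative)

# Hypothetical gate parameters; the bias is set negative so that, initially,
# the gate tends to carry the layer input through unchanged.
W_g = rng.normal(scale=0.1, size=(d, d))
b_g = np.full(d, -1.0)

def elevator_unit(x, h):
    """Gated linear combination of the layer input x and the layer's LSTM
    output h. The gate g decides, per dimension, how much transformed
    information (h) versus unchanged input (x) flows up to the next layer."""
    g = sigmoid(x @ W_g + b_g)
    return g * h + (1.0 - g) * x

x = rng.normal(size=d)            # input arriving at this layer
h = np.tanh(rng.normal(size=d))   # stand-in for the layer's LSTM output
y = elevator_unit(x, h)           # what is passed to the layer above
```

Because the output is an elementwise convex combination of `x` and `h`, each component of `y` lies between the corresponding components of the two inputs, which is what lets gradients and low-level features pass through a 20-layer stack without being forced through every nonlinearity.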
Authors
WANG Mingxuan
LIU Qun
WANG Mingxuan; LIU Qun (Key Lab of Intelligent Information Processing of Chinese Academy of Sciences, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; ADAPT Centre, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland)
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2018, No. 2, pp. 50-57 (8 pages)
Journal of Chinese Information Processing
Keywords
semantic role labeling, deep learning