Abstract
Traditional approaches to semantic role labeling rely on statistical machine learning over syntactic features. Because dependency syntax can represent semantic relations between words, these approaches perform well in semantic role labeling. However, their hand-crafted feature extraction is laborious, and they have difficulty capturing long-range dependencies within a sentence. With the rise of deep learning, researchers have applied bidirectional long short-term memory (Bi-LSTM) neural networks to semantic role labeling; such models learn features automatically and effectively model long-range dependencies between words. This paper proposes a method that combines a Bi-LSTM-CRF model with dependency syntactic features and introduces a gated filtering mechanism (GFM) to adjust the word vector representations, so that syntactic features improve labeling accuracy without laborious feature engineering. Experimental results on CPB show that the proposed method achieves an F1 score of 79.53% on Chinese semantic role labeling, a clear improvement over previous work.
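To make the gating idea concrete, the following is a minimal sketch of how a gate might rescale a word vector using a dependency feature vector. The abstract does not give the formulation, so everything here is an assumption for illustration: the function name `gated_word_vector`, the sigmoid gate over the concatenated vectors, and the toy dimensions are hypothetical and are not the authors' published model.

```python
import math
import random


def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_word_vector(word_vec, dep_vec, weights, bias):
    """Rescale each dimension of word_vec by a gate in (0, 1).

    Assumed form (hypothetical, not taken from the paper):
        g = sigmoid(W [word_vec; dep_vec] + b)
        output = g * word_vec   (element-wise)
    """
    joint = word_vec + dep_vec  # list concatenation: [word_vec; dep_vec]
    gated = []
    for i, x in enumerate(word_vec):
        z = sum(w * v for w, v in zip(weights[i], joint)) + bias[i]
        gated.append(sigmoid(z) * x)
    return gated


# Toy inputs: a 3-dimensional word vector and a 2-dimensional dependency feature.
random.seed(0)
word_vec = [0.5, -1.2, 0.3]
dep_vec = [1.0, 0.0]
weights = [[random.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

adjusted = gated_word_vector(word_vec, dep_vec, weights, bias)
print(adjusted)
```

In a full model the gate parameters would be learned jointly with the Bi-LSTM-CRF tagger, and the adjusted vectors would be fed to the Bi-LSTM in place of the raw word embeddings; here the weights are random only to keep the sketch self-contained.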
Authors
ZHANG Miaomiao (张苗苗)
LIU Mingtong (刘明童)
ZHANG Yujie (张玉洁)
XU Jinan (徐金安)
CHEN Yufeng (陈钰枫)
School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Source
Technology Intelligence Engineering (《情报工程》)
2018, No. 2, pp. 45-53 (9 pages)
Funding
Beijing Jiaotong University Talent Fund (KKRC11001532)
National Natural Science Foundation of China (61370130, 61473294)
Beijing Natural Science Foundation (4172047)