Abstract
Syntactic annotation is a key step in the deep processing of corpora and an important method in treebank development and research. This article surveys current automatic dependency annotation methods for deep corpus processing. It describes and compares the grammar models, annotation schemes, and automatic parsing methods involved in dependency annotation, and then analyzes the problems that arise when automatic dependency annotation is applied to corpus processing. The aim is to offer ideas and methods for the future development of an English-Chinese bilingual dependency treebank.
Source
Modern Foreign Languages (《现代外语》)
CSSCI
Peking University Core Journals (北大核心)
2018, No. 2, pp. 279-289 (11 pages)
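Funding
Ministry of Education Humanities and Social Sciences Youth Fund project "Construction of an English-Chinese Bidirectional Parallel Dependency Treebank and a Study of Deep-Structure Transfer Patterns in Translation" (17YJC740052)
China Postdoctoral Science Foundation, 61st batch of General Program grants (2017M610810)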
Keywords
corpus
syntactic annotation
dependency grammar
automatic parsing
data-driven