Abstract
Building on research into multilingual syntactic treebanks, this paper constructs a Tibetan dependency treebank of about 10,000 sentences. Drawing on a range of dependency annotation specifications from China and abroad, and taking into account Tibetan grammatical theory and the language's typological features, it develops a dependency annotation scheme for Tibetan. Corpus statistics are used to verify the validity and soundness of the scheme: the paper reports the distribution of parts of speech and dependency relations and summarizes the distribution patterns of dependency structures in the corpus, providing linguistic data to support automatic syntactic parsing.
Authors
泽仁卓玛
祁坤钰
夏吾措毛
ZE Renzhuoma; QI Kunyu; XIA Wucuomao (China Institution of Information Technology for Nationalities, Northwest Minzu University, Lanzhou 730030, China)
Source
《西北民族大学学报(自然科学版)》
2024, No. 3, pp. 80-88 (9 pages)
Journal of Northwest Minzu University(Natural Science)
Funding
Construction and Application of a Knowledge Graph of Dunhuang Old Tibetan Documents (23XTQ004).
Keywords
Tibetan
Dependency syntactic treebank
Dependency relation