
A Treebank-Based Investigation of the Distribution of Phrases in Modern Chinese (基于树库的现代汉语短语分布考察) Cited by: 6

A Study on Grammatical Functions of Phrases in Mandarin Chinese Based on Chinese TreeBank
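Abstract Automatic syntactic parsing needs to determine the grammatical function of each phrase instance. Drawing on statistics from a large-scale Chinese treebank, this article quantitatively analyzes the distribution of the grammatical functions of Chinese phrases, and uses that analysis to evaluate three methods of determining the grammatical function of a phrase instance. It first describes how structural phrase categories fill 11 grammatical functions, then gives statistics and analysis of how functional phrase categories fill the various grammatical functions, and finally uses the head word to estimate the grammatical function of attributive-head (定中) phrases. After comparing how well the different methods estimate the grammatical function of phrase instances, it concludes that the grammatical functions of Chinese phrases show a certain degree of clustering, but that estimating a phrase's grammatical function from its category label alone performs poorly in automatic parsing.
The grammatical function of a phrase instance should be determined during parsing. Based on a large-scale treebank corpus, the paper investigates the grammatical functions of phrases in Mandarin Chinese. It first describes the distribution of the grammatical functions of six kinds of phrases categorized by structure, using statistics from a large-scale treebank. It then explores the distribution of the grammatical functions of 11 kinds of phrases categorized by function. Finally, it compares several methods for determining the grammatical function of an instance of an np-DZ phrase, and concludes that a parser would hardly perform well if it determined the grammatical function of a phrase instance by category alone.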
Authors Chen Feng (陈锋), Chen Xiaohe (陈小荷)
Source Linguistic Sciences (《语言科学》), CSSCI, 2008, No. 1, pp. 12-17 (6 pages)
Funding Supported by the National Social Science Fund of China (Grant No. 07BYY050)
Keywords automatic syntactic parsing (parse); Modern Chinese (Mandarin Chinese); phrase; grammatical function; treebank
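Note The abstract evaluates how well a phrase's category label predicts its grammatical function. A minimal sketch of that kind of measurement follows. It assumes treebank phrases have already been reduced to (category, function) pairs; the sample pairs, the label names, and both function names are hypothetical illustrations, not the paper's tagset, data, or method.

```python
from collections import Counter, defaultdict

# Hypothetical (phrase category, grammatical function) pairs, standing in
# for phrase instances extracted from a treebank. Labels are illustrative.
observations = [
    ("np", "subject"),
    ("np", "object"),
    ("np", "object"),
    ("np", "attributive"),
    ("vp", "predicate"),
    ("vp", "predicate"),
    ("vp", "object"),
    ("pp", "adverbial"),
]


def count_functions(pairs):
    """Count how often each category fills each grammatical function."""
    counts = defaultdict(Counter)
    for category, function in pairs:
        counts[category][function] += 1
    return counts


def function_distribution(pairs):
    """Relative frequency of each grammatical function within a category."""
    return {
        category: {f: n / sum(c.values()) for f, n in c.items()}
        for category, c in count_functions(pairs).items()
    }


def majority_baseline_accuracy(pairs):
    """Accuracy of always guessing a category's most frequent function.

    Scored on the same pairs used for counting, so it is an optimistic
    estimate rather than a held-out evaluation.
    """
    counts = count_functions(pairs)
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(pairs)


if __name__ == "__main__":
    for category, dist in function_distribution(observations).items():
        print(category, dist)
    print("majority baseline accuracy:", majority_baseline_accuracy(observations))
```

A category whose distribution is concentrated on one function is easy to label this way; a category spread over several functions caps the accuracy that a label-only estimate can reach.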











