
Explicit Discourse Relation Parsing of Chinese Text (汉语显式篇章关系分析)

Cited by: 1
Abstract: Discourse relations are either explicit or implicit. An explicit relation is marked by an overt discourse connective between the basic discourse units. For explicit discourse relations in Chinese, this paper builds an explicit discourse relation parsing platform consisting of connective identification and discourse relation (sense) classification. Explicit discourse relations were annotated on 500 texts selected from the Chinese Penn Treebank (CTB). Using the headwords of connectives as features, a connective identification module was built with a maximum entropy classifier; it achieves an F1 of 66.79%. A discourse relation classifier based on contextual features such as the connective itself and its part of speech achieves an F1 of 91.92% on the four top-level semantic relation classes.
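Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2014, No. 6, pp. 101-106 (6 pages)
Funding: National Natural Science Foundation of China (61333018); National Natural Science Foundation of China (61273320); National High Technology Research and Development Program of China, 863 Program (2012AA011102)
Keywords: connective identification; sense classification; maximum entropy classifier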
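
The connective identification step described in the abstract can be illustrated with a minimal sketch. It assumes a binary maximum entropy model (equivalent to logistic regression for two classes) trained by batch gradient ascent. The feature template in `extract_features`, the part-of-speech tags, and the toy sentences are hypothetical and chosen only for illustration; they are not the paper's feature set or data, and the sketch makes no attempt to reproduce the reported scores.

```python
import math
from collections import defaultdict


def extract_features(tokens, pos_tags, i):
    """Hypothetical feature template for a candidate connective at position i."""
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [
        "bias",
        "word=" + tokens[i],
        "pos=" + pos_tags[i],
        "prev=" + prev_word,
        "next=" + next_word,
        "word+pos=" + tokens[i] + "/" + pos_tags[i],
    ]


def predict_proba(weights, feats):
    """Probability that the candidate is a discourse connective."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(examples, epochs=200, lr=0.5, l2=0.01):
    """Binary maximum entropy model trained by L2-regularized batch gradient ascent."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, label in examples:
            p = predict_proba(weights, feats)
            for f in feats:
                grad[f] += label - p  # observed minus expected feature count
        for f, g in grad.items():
            weights[f] += lr * (g / len(examples) - l2 * weights[f])
    return weights


def f1_score(gold, pred):
    """F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data invented for this sketch: (tokens, POS tags, candidate index, label).
# Label 1 = the candidate acts as a discourse connective, 0 = it does not.
toy_data = [
    (["因为", "下雨", ",", "所以", "比赛", "取消"], ["CS", "VV", "PU", "AD", "NN", "VV"], 0, 1),
    (["因为", "下雨", ",", "所以", "比赛", "取消"], ["CS", "VV", "PU", "AD", "NN", "VV"], 3, 1),
    (["虽然", "很", "累", ",", "但是", "他", "坚持"], ["CS", "AD", "VA", "PU", "AD", "PN", "VV"], 0, 1),
    (["虽然", "很", "累", ",", "但是", "他", "坚持"], ["CS", "AD", "VA", "PU", "AD", "PN", "VV"], 4, 1),
    (["他", "和", "我", "去", "北京"], ["PN", "CC", "PN", "VV", "NR"], 1, 0),
    (["苹果", "和", "香蕉"], ["NN", "CC", "NN"], 1, 0),
]

train = [(extract_features(t, p, i), y) for t, p, i, y in toy_data]
weights = train_maxent(train)

gold = [y for _, y in train]
pred = [1 if predict_proba(weights, feats) > 0.5 else 0 for feats, _ in train]
train_f1 = f1_score(gold, pred)

# A candidate in a sentence not seen during training.
new_feats = extract_features(
    ["天气", "很", "冷", ",", "但是", "我们", "出发"],
    ["NN", "AD", "VA", "PU", "AD", "PN", "VV"],
    4,
)
new_prob = predict_proba(weights, new_feats)
print("training F1:", train_f1, "P(connective | 但是):", round(new_prob, 3))
```

The same structure extends to sense classification by replacing the binary label with one of the four top-level relation classes and using a multiclass (softmax) model; that extension is not shown here.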









