
The Information-Theory-Based Feature Type Analysis in the Modeling for Probabilistic Parsing (cited by: 4)
Abstract Statistical parsing uses a probabilistic evaluation model to score the likelihood of each candidate parse tree and selects the highest-probability candidate as the final parse. The core of a statistical parser is therefore its probabilistic evaluation model, and such models differ mainly in which contextual features they use to assign probabilities to parse trees. Many probabilistic evaluation models have been proposed for statistical parsing, but different models rely on different feature types, which raises the question of how to evaluate the contribution of each feature type to parsing. To address this question, the paper proposes a feature type analysis model for statistical parsing that quantifies, from the viewpoint of information theory, how well different types of contextual features predict syntactic structure. The basic idea is to use entropy and conditional entropy to measure whether a feature type captures the main information needed to predict syntactic structure: if the uncertainty (entropy) of the current syntactic structure drops markedly once a feature type is added, that feature type is deemed to have captured some of the main contextual information that influences syntactic structure. The model uses four measures (predictive information quantity, predictive information gain, predictive information relevance, and predictive information summation) to analyze quantitatively, from different angles, the predictive power of individual feature types and of feature type combinations for the current target. Using the Penn Treebank for training and testing, the experiments systematically quantify the predictive power of different contextual feature types for parsing rules and draw a series of conclusions about the predictive power of different feature types and feature type combinations for syntactic parsing.
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list, 2001, No. 2, pp. 144-151 (8 pages)
Funding Supported by the National "973" Program (G1998030507-4) and the National Natural Science Foundation of China (69483003)
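The entropy-based idea in the abstract can be illustrated with a minimal sketch. It computes the empirical entropy H(Y) of a target (for example, the parsing rule applied), the conditional entropy H(Y | X) given one feature type X, and the difference between the two. The function names and the toy data are assumptions made for illustration only, and the sketch shows the generic entropy reduction rather than the paper's exact definitions of its four measures, which this record does not give.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical Shannon entropy H(Y), in bits, of a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(features, labels):
    """Empirical conditional entropy H(Y | X), in bits."""
    total = len(labels)
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(x, []).append(y)
    # Weight the entropy of each feature-value group by its relative frequency.
    return sum(len(ys) / total * entropy(ys) for ys in groups.values())


def entropy_reduction(features, labels):
    """H(Y) - H(Y | X): how much knowing X lowers the uncertainty of Y."""
    return entropy(labels) - conditional_entropy(features, labels)


# Toy data, invented for illustration (not taken from the Penn Treebank).
rules = ["NP -> DT NN", "NP -> DT NN", "NP -> NN", "NP -> NN"]
parent_label = ["S", "S", "VP", "VP"]   # hypothetical informative feature
position_parity = ["a", "b", "a", "b"]  # hypothetical uninformative feature

informative = entropy_reduction(parent_label, rules)
uninformative = entropy_reduction(position_parity, rules)
print(informative, uninformative)
```

In this toy setup the first feature fully determines the rule, so the uncertainty drops from one bit to zero, while the second feature leaves it unchanged. A feature type of the first kind is what the abstract describes as capturing the main information for predicting syntactic structure.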











