摘要
面向数据的分析技术(Data-Oriented Parsing,DOP)是一种概率分析策略,其概率模型的主要目的在于为一个给定的句子找到最可能的分析,即分析消歧。实际上,有关算法计算复杂度的大量研究证明,该类消歧问题属于NP-完全问题。因此,为有效实现最可能的分析,国外学者提出许多近似分析算法。本文主要论述在 DOP 框架中,基于 Monte Carlo 方法找到最可能分析的近似分析算法,并说明该方法可在合理的算法时间代价范围内实现,而且在统计上受控,以确保所获得的近似解确实对应着分析消歧后的精确解。
Data-Oriented Parsing(DOP)technique is a kind of probabilistic parsing strategy. The main goal of DOP model is to find the most probable parse for a given input sentence, that is, parse disambiguation. In fact, it is proved through a lot of research work about algorithm computation complexity that this kind of disambiguation problem belongs to the class of NP-Complete problem. So in order to implement the most probable parse efficiently, some researchers have proposed many approximation parsing algorithms. This paper mainly presents a kind of approximation parsing algorithm based on Monte Carlo method in DOP framework, which can be implemented at reasonable(i, e. polynomial)algorithmic cost. And at the same time, under statistical control, it is guaranteed that an obtained approximate solution indeed corresponds to an exact solution of the problem after disambiguation.
出处
《计算机科学》
CSCD
北大核心
2006年第3期174-178,共5页
Computer Science
基金
本文得到国家自然科学基金(编号:60203010
70501018与605333100)
上海财经大学"211工程"(2004年)资助