Abstract
This paper presents a Chinese parsing method that uses data-oriented parsing (DOP) as its basic framework and applies a similarity-based probability estimation technique. Each input sentence first passes through two levels of initial selection, at the word level and at the part-of-speech level. The fragment combination forms of the input sentence are then obtained from a previously constructed knowledge source comprising a treebank, a fragment bank, and a fragment combination bank. Finally, the similarity between the input sentence and the initial selection results is estimated with the similarity-based probability estimation technique, which completes the combination parsing of the input sentence. To demonstrate the effectiveness of the method, the knowledge source was built from a real-world Chinese corpus of 1,000 sentences, and a separate real-world Chinese corpus of 100 sentences was used as the test set. The experimental results show that all evaluation measures are satisfactory and that the method parses Chinese effectively.
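The two-level initial selection and the similarity-based ranking described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `TREEBANK`, the function names, the 0.5 threshold, the equal weighting of word and part-of-speech similarity, and the use of `difflib.SequenceMatcher` as the similarity measure. It covers only preselection and ranking, not the DOP fragment combination step.

```python
from difflib import SequenceMatcher

# Hypothetical toy treebank: (words, POS tags, bracketed tree). Not from the paper.
TREEBANK = [
    (["我", "喜欢", "书"], ["r", "v", "n"], "(S (NP 我) (VP 喜欢 (NP 书)))"),
    (["他", "看", "电影"], ["r", "v", "n"], "(S (NP 他) (VP 看 (NP 电影)))"),
    (["天气", "很", "好"], ["n", "d", "a"], "(S (NP 天气) (AP 很 好))"),
]


def word_level_select(words, bank):
    """Stage 1 (assumed): keep entries sharing at least one word with the input."""
    chosen = [entry for entry in bank if set(words) & set(entry[0])]
    return chosen or list(bank)  # fall back to the whole bank if nothing matches


def pos_level_select(tags, bank, threshold=0.5):
    """Stage 2 (assumed): keep entries whose POS sequence is similar enough."""
    return [
        entry for entry in bank
        if SequenceMatcher(None, tags, entry[1]).ratio() >= threshold
    ]


def parse_by_similarity(words, tags, bank):
    """Rank preselected trees by a normalized similarity score (illustrative only)."""
    candidates = pos_level_select(tags, word_level_select(words, bank))
    scored = []
    for cand_words, cand_tags, tree in candidates:
        word_sim = SequenceMatcher(None, words, cand_words).ratio()
        pos_sim = SequenceMatcher(None, tags, cand_tags).ratio()
        scored.append((0.5 * word_sim + 0.5 * pos_sim, tree))
    total = sum(score for score, _ in scored)
    if total == 0:
        return []
    # Normalize so the scores behave like a probability distribution over candidates.
    ranked = [(score / total, tree) for score, tree in scored]
    return sorted(ranked, reverse=True)


if __name__ == "__main__":
    for prob, tree in parse_by_similarity(["我", "看", "书"], ["r", "v", "n"], TREEBANK):
        print(f"{prob:.3f}  {tree}")
```

Filtering on words before part-of-speech tags mirrors the order given in the abstract; a real system would replace the toy similarity measure with the paper's own estimate and combine tree fragments rather than return whole stored trees.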
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journals (北大核心)
2000, Issue 1, pp. 13-21 (9 pages)
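Funding
National Natural Science Foundation of China (Grant No. 69675019)
Special Doctoral Program Fund of the State Education Commission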
Keywords
Data-oriented parsing (DOP)
Chinese parsing
Similarity estimate
Treebank
Fragment bank
Fragment combination form bank