Abstract
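This paper proposes a decision-tree method, built on a Chinese treebank, for identifying Chinese base noun phrases (BNPs). The core idea is to automatically extract from the corpus the part-of-speech templates of base noun phrases together with their context information, and then to build the corresponding decision tree with the ID3 algorithm. The method effectively introduces a learning mechanism, improves the system's performance and recognition speed, and achieves good precision and recall.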
This paper presents a decision tree approach, based on a Chinese treebank, to identifying Chinese base noun phrases (BNPs). A self-learning mechanism is integrated into the model in two steps: automatic extraction of POS tag sequences (BNP rules) and their context information from the corpus, followed by decision tree training with the ID3 algorithm. Experimental results show that the method performs well.
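The abstract gives no implementation details, so the following is only a minimal sketch of how ID3 could be trained on POS templates and their context. The feature names (`template`, `prev`, `next`), the POS tags, the labels `BNP`/`O`, and the toy examples are all hypothetical and are not taken from the paper or its treebank.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(p) / len(rows) * entropy(p)
                    for p in partitions.values())
    return entropy(labels) - remainder

def id3(rows, labels, attrs):
    """Build a decision tree: a leaf is a label, a node is a dict."""
    if len(set(labels)) == 1:
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:
        return majority
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    rest = [a for a in attrs if a != best]
    node = {"attr": best, "children": {}, "default": majority}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        node["children"][value] = id3(list(sub_rows), list(sub_labels), rest)
    return node

def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical training examples: a candidate POS template plus the POS tags
# immediately before and after it, labeled as a base noun phrase or not.
rows = [
    {"template": "a+n", "prev": "v", "next": "w"},
    {"template": "a+n", "prev": "p", "next": "u"},
    {"template": "n+n", "prev": "v", "next": "w"},
    {"template": "n+n", "prev": "n", "next": "n"},
    {"template": "v+n", "prev": "d", "next": "w"},
    {"template": "v+n", "prev": "p", "next": "u"},
]
labels = ["BNP", "BNP", "BNP", "O", "O", "BNP"]

tree = id3(rows, labels, ["template", "prev", "next"])
predictions = [classify(tree, row) for row in rows]
```

In this sketch a candidate span is classified by walking the tree from the root, and an attribute value never seen in training falls back to the majority label stored at that node. A real system would draw the templates and context from treebank annotations rather than from a hand-written list.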
Source
《黑龙江工程学院学报》
CAS
2004, No. 2, pp. 1-4 (4 pages)
Journal of Heilongjiang Institute of Technology
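Funding
Supported by the National Natural Science Foundation of China (60373101)
Supported by the National High-Tech R&D Program of China (863 Program) (2002AA117010-09)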