Abstract
This paper proposes a method for recognizing Chinese verb-verb collocations based on the maximum entropy model and voting. First, five composite feature templates are constructed by combining part-of-speech information from the context of the target verb and the candidate collocate with statistical information on the strength of association between them. A maximum entropy model is then trained on each template to obtain the corresponding collocation discriminator. Finally, a combined discriminator is built by a voting scheme in which the best individual discriminator takes priority. Experimental results show that discriminators using both contextual part-of-speech information and association statistics outperform those using either kind of information alone, and that the combined discriminator performs best.
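The abstract does not spell out how the best discriminator takes priority in the vote, so the following is only a minimal sketch under stated assumptions: each discriminator returns a binary decision (collocation or not), the best discriminator is identified beforehand (for example on held-out data), its vote carries extra weight, and it also breaks ties. The names `combine_votes`, `best_index`, and `best_weight` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def combine_votes(votes: Sequence[bool], best_index: int, best_weight: int = 2) -> bool:
    """Combine binary decisions from several discriminators.

    Assumption (not from the paper): every discriminator casts one vote,
    except the best one, which casts `best_weight` votes and also decides
    the outcome when the weighted vote is tied.
    """
    if not 0 <= best_index < len(votes):
        raise ValueError("best_index must point at one of the discriminators")

    # Weighted count of votes in favour of "this pair is a collocation".
    yes = sum(
        best_weight if i == best_index else 1
        for i, vote in enumerate(votes)
        if vote
    )
    total = len(votes) - 1 + best_weight
    no = total - yes

    if yes == no:
        # Tie: defer to the best discriminator.
        return votes[best_index]
    return yes > no


# Five discriminators, one per feature template; index 0 is assumed to be the best.
decision = combine_votes([True, False, False, True, False], best_index=0)
```

With five voters and a double-weighted best discriminator, a 3-to-2 plain majority against the best discriminator becomes a 3-to-3 weighted tie, which the best discriminator then resolves. Other priority schemes (for instance, letting the best discriminator override only low-confidence majorities) would fit the abstract's description equally well.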
Source
《小型微型计算机系统》
CSCD
PKU Core Journals (北大核心)
2007, No. 7, pp. 1306-1310 (5 pages)
Journal of Chinese Computer Systems
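Funding
Supported by the National Natural Science Foundation of China (60573074)
Supported by the Natural Science Foundation of Shanxi Province (20041040)
Supported by the Shanxi Province Key Science and Technology Program (051129)
Supported by the Shanxi Province Youth Science and Technology Foundation (20031027)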
Keywords
collocation
maximum entropy model
feature function
voting