The Identification of Chinese Nominalization Compounds
Abstract: Identifying nominalization compounds is one of the hard problems in Chinese compound recognition. The difficulty is that when a verb and a noun co-occur, the expression may be either a verb phrase or a nominalization compound. Traditional approaches to Chinese compound identification typically rely only on statistical features drawn from a corpus, and their results are often unsatisfactory. This paper uses a maximum entropy model that, in addition to baseline contextual features, adopts lexical features and Web features to judge nominalization candidates formed by verb-noun co-occurrence. The experimental results are encouraging: precision reaches 86.31% and recall reaches 70.00%.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal, 2008, Issue 9, pp. 283-285 (3 pages)
Funding: National Natural Science Foundation of China (60496326)
Keywords: Maximum entropy model; Nominal compounds (NC); Compound ability (CA); Thesaurus; Web features; Pointwise mutual information based on information retrieval (PMI-IR)
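The keywords name PMI-IR as one of the Web features. The abstract does not give the exact formula or the search engine the authors used, so the following is only a minimal sketch of the standard pointwise mutual information estimated from search hit counts. The function name `pmi_ir`, its parameters, and all counts shown are hypothetical and chosen for illustration.

```python
import math


def pmi_ir(hits_xy, hits_x, hits_y, total_pages):
    """Estimate PMI(x, y) = log2(p(x, y) / (p(x) * p(y))) from hit counts.

    Each probability is approximated as hits / total_pages, so the ratio
    simplifies to (hits_xy * total_pages) / (hits_x * hits_y).
    All arguments are assumed to come from a search index; none of these
    names are taken from the paper.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # With a zero count the association is undefined; report it as -inf.
    if min(hits_xy, hits_x, hits_y) <= 0:
        return float("-inf")
    return math.log2((hits_xy * total_pages) / (hits_x * hits_y))


# Hypothetical counts for a verb-noun candidate pair.
score = pmi_ir(hits_xy=200, hits_x=1000, hits_y=1000, total_pages=10000)
print(score)  # 1.0
```

A higher score means the verb and noun co-occur more often than independence would predict, which is the kind of evidence a classifier could use as one feature among the contextual and lexical ones.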