Abstract
Concept acquisition from corpora is an important research topic in natural language understanding. This paper presents a representation of noun concepts based on Chinese classifier words (measure words). Each concept is modeled as a vector with one component per classifier word, and a weighting scheme is designed and implemented to assign each classifier word a weight in a concept. Clustering experiments explore how much classifier words contribute to distinguishing noun semantics. The results show that the classifier-based representation is effective and can distinguish most noun concepts.
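The idea of a noun concept as a weighted vector over classifier words can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the TF-IDF-style weight below is an assumption chosen only for illustration, as are the toy classifier-noun pairs and the names `PAIRS`, `build_vectors`, and `cosine`. The resulting similarities could be fed to any standard clustering algorithm.

```python
import math
from collections import Counter, defaultdict

# Hypothetical (classifier, noun) observations; toy data, not the paper's corpus.
# 条 = long thin things, 只 = animals, 本 = bound volumes, 部 = works.
PAIRS = [
    ("条", "鱼"), ("条", "鱼"), ("条", "蛇"), ("条", "河"), ("条", "河"),
    ("只", "猫"), ("只", "猫"), ("只", "狗"), ("条", "狗"), ("只", "鸟"),
    ("本", "书"), ("本", "书"), ("本", "词典"), ("部", "词典"),
    ("部", "小说"), ("本", "小说"),
]


def build_vectors(pairs):
    """Map each noun to a sparse vector {classifier: weight}."""
    counts = defaultdict(Counter)  # noun -> classifier -> co-occurrence count
    for classifier, noun in pairs:
        counts[noun][classifier] += 1

    n_nouns = len(counts)
    # Number of distinct nouns each classifier combines with.
    noun_freq = Counter(c for per_noun in counts.values() for c in per_noun)

    vectors = {}
    for noun, per_noun in counts.items():
        total = sum(per_noun.values())
        # Assumed weight: relative frequency within the noun, scaled down
        # for classifiers that combine with many different nouns.
        vectors[noun] = {
            c: (k / total) * math.log(1 + n_nouns / noun_freq[c])
            for c, k in per_noun.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    vecs = build_vectors(PAIRS)
    for a, b in [("书", "词典"), ("猫", "鸟"), ("书", "猫")]:
        print(a, b, round(cosine(vecs[a], vecs[b]), 3))
```

In this toy data, nouns that share classifiers (for example 书 and 词典) receive a positive similarity, while nouns with no classifier in common (书 and 猫) receive zero, which is the property a clustering step would rely on.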
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2014, No. 5, pp. 60-65 (6 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (No. 61300152)
Keywords
Concept acquisition
classifier-noun collocation
classifier words
clustering