Abstract
Automatic acquisition of domain terms is an important research issue in natural language processing (NLP). Domain term databases are still built mostly by hand, which is costly and slow to update. This paper proposes and builds a model for the automatic acquisition of domain terms based on the CBC clustering method; the model avoids the limitations of acquiring domain terms purely by domain subtraction or by statistical methods. It also introduces a modified cosine formula to compute the similarity between terms, which forms the core module of the automatic domain term acquisition system.
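The abstract does not state how the cosine formula is modified, so the following is only a minimal sketch of the standard (unmodified) cosine coefficient between two candidate terms, each represented by counts of the context words it co-occurs with. The function name `cosine_similarity`, the `contexts` data, and the choice of raw counts as features are all illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Standard cosine coefficient between two sparse feature vectors.

    `a` and `b` map a context feature to its weight. This is the
    textbook formula, not the modified version used in the paper.
    """
    shared = set(a) & set(b)
    dot = sum(a[f] * b[f] for f in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical context-word counts for three candidate terms.
contexts = {
    "clustering": Counter({"algorithm": 3, "data": 2, "committee": 1}),
    "classification": Counter({"algorithm": 2, "data": 3, "label": 1}),
    "harvest": Counter({"field": 4, "crop": 2}),
}

sim_related = cosine_similarity(contexts["clustering"], contexts["classification"])
sim_unrelated = cosine_similarity(contexts["clustering"], contexts["harvest"])
```

In a CBC-style pipeline, pairwise scores of this kind would feed the clustering step, so that terms sharing many contexts end up in the same group while terms with no shared context score zero.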
Source
《软件导刊》 (Software Guide)
2008, No. 9, pp. 23-24 (2 pages)
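Funding
Major Project of the Key Research Bases of the Ministry of Education of China (07JJD74006)
Hubei Province Science and Technology Research Project (2007AA101C49)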
Keywords
term
domain term
CBC (clustering by committee)
Chinese information processing
natural language processing