Abstract
Manual ontology construction is time-consuming and labor-intensive. To address this problem, this paper proposes a method for automatically extracting ontology concepts, taking tea domain knowledge as the object of study. The method first segments a tea domain corpus using Chinese word segmentation, then uses mutual information to derive a set of candidate concepts (compound words) from the segmented corpus, and finally extracts the tea domain ontology concepts automatically by judging the domain relevance (domain coherence) of both the candidate concepts and the non-compound words. A prototype system was developed on the basis of this method, and the experimental results indicate that the method is effective.
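The abstract does not give the formulas or thresholds used, so the following is only a minimal sketch of the two scoring steps under stated assumptions: pointwise mutual information over adjacent tokens stands in for the mutual information step, and a smoothed ratio of relative frequencies between a domain corpus and a general corpus stands in for the domain relevance judgment. The function names, the `threshold` and `min_count` defaults, and the toy tokens are all hypothetical, and the input is assumed to be already segmented.

```python
import math
from collections import Counter


def pmi_candidates(tokens, threshold=1.0, min_count=2):
    """Score adjacent token pairs by pointwise mutual information (PMI).

    Assumption: pairs with PMI >= threshold that occur at least min_count
    times are treated as candidate compound words.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    candidates = {}
    for (left, right), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_left = unigrams[left] / n_uni
        p_right = unigrams[right] / n_uni
        score = math.log2(p_pair / (p_left * p_right))
        if score >= threshold:
            # Join the two tokens into one candidate compound word.
            candidates[left + right] = score
    return candidates


def domain_relevance(term, domain_freq, general_freq, domain_total, general_total):
    """Ratio of a term's relative frequency in the domain corpus to its
    relative frequency in a general corpus, with add-one smoothing.

    Assumption: a larger ratio means the term is more domain-specific.
    """
    p_domain = (domain_freq.get(term, 0) + 1) / (domain_total + 1)
    p_general = (general_freq.get(term, 0) + 1) / (general_total + 1)
    return p_domain / p_general


# Toy, pre-segmented input (hypothetical, not taken from the paper's corpus).
tokens = ["绿", "茶", "含有", "茶", "多酚", "绿", "茶", "的", "茶", "多酚", "含量", "高"]
candidates = pmi_candidates(tokens)
```

In this sketch the candidates from `pmi_candidates` and the remaining single tokens would both be passed through `domain_relevance`, and only terms scoring above some cutoff would be kept as concepts; the cutoff itself would have to be tuned on real data.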
Source
《农业网络信息》 (Agriculture Network Information)
2010, No. 8, pp. 13-15, 24 (4 pages)
Funding
National 863 Program project "Construction of the Agricultural Knowledge Grid and Research on Its Key Technologies" (No. 2006AA10Z249)
Keywords
tea domain ontology
concept extraction
mutual information
domain coherence