Abstract
To extract medical enzyme ontology concepts from Chinese text, this paper proposes a concept extraction method that combines composite seed concept extraction with extended concept extraction. Composite seed concept extraction generates concepts by taking a seed as the head word and uses mutual information to decide whether the seed and a collocated word form a term. Extended concept extraction defines its fitness function through the degree of domain membership and extracts extended concepts with the artificial bee colony algorithm, chosen for its simple operation and strong robustness, which addresses the problems of premature convergence and redundant rules. The composite seed concepts and the extended concepts together form the candidate concept set, from which experts determine the final domain concepts. A validation experiment was run on a Chinese medical enzyme corpus, and experts confirmed that the method performs well for concept extraction.
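The mutual-information test for composite seed concepts can be illustrated with a minimal sketch. It assumes pre-segmented text, adjacent-word collocations, and pointwise mutual information (PMI) as the measure; the function names, the threshold value, and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(sentences, seed):
    """Return PMI for every adjacent word pair that contains `seed`.

    `sentences` is a list of token lists (already word-segmented).
    Assumption: collocations are adjacent words only.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (left, right), count in bigrams.items():
        if seed not in (left, right):
            continue
        p_pair = count / n_bi
        p_left = unigrams[left] / n_uni
        p_right = unigrams[right] / n_uni
        # PMI = log2( P(left, right) / (P(left) * P(right)) )
        scores[left + right] = math.log2(p_pair / (p_left * p_right))
    return scores


def composite_candidates(sentences, seed, threshold=2.0):
    """Keep pairs whose PMI reaches the (hypothetical) threshold."""
    scores = pmi_scores(sentences, seed)
    return sorted(term for term, score in scores.items() if score >= threshold)


# Hypothetical toy corpus, segmented by hand for illustration only.
toy_corpus = [
    ["蛋白", "酶"],
    ["蛋白", "酶"],
    ["的", "酶"],
    ["的", "药"],
    ["的", "药"],
    ["的", "水"],
]
candidates = composite_candidates(toy_corpus, "酶")
```

In this toy corpus the pair 蛋白 + 酶 co-occurs far more often than chance would predict, so it passes the threshold, while the function word 的 before 酶 does not. A real system would need a segmenter, a much larger corpus, and a threshold tuned on that corpus.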
Source
Journal of Liaocheng University (Natural Science Edition) (《聊城大学学报(自然科学版)》)
2013, No. 3, pp. 91-95 (5 pages)
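Funding
Shandong Province Higher Education Science and Technology Program (J11LG57)
Spark Program of the Shandong Provincial Department of Science and Technology (2011XH17006)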
Keywords
ontology learning
artificial bee colony
seed
concept extraction
medical enzyme