Abstract
To achieve optimal clustering of same-domain concepts and resolve concept overlap in domain ontology learning, this paper introduces graph entropy extreme value theory and proposes a new domain concept clustering method. Following the principle of maximum information entropy, the concept nodes of a graph are treated as a whole rather than selecting a centroid, and the graph entropy minimization formula is used to design an automatic concept clustering mechanism. Experimental results show that, compared with the K-means algorithm and with density-based and distance-based domain concept clustering methods, the proposed method effectively improves precision, recall, and the comprehensive evaluation index, the F value.
Authors
安敬民
李冠宇
AN Jingmin; LI Guanyu (School of Computer and Software, Dalian Neusoft University of Information, Dalian, Liaoning 116023, China; Information Science and Technology College, Dalian Maritime University, Dalian, Liaoning 116026, China)
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2020, No. 6, pp. 88-93 (6 pages)
Funding
National Natural Science Foundation of China (61371090, 61602075)
Natural Science Foundation of Liaoning Province (20180550940)
Keywords
domain concept (领域概念)
domain ontology (领域本体)
concept overlapping (概念重叠)
graph entropy (图熵)
concept clustering (概念聚类)