Abstract
To address the unsatisfactory efficiency and accuracy of extracting Chinese domain ontology concepts from unstructured text, we propose a domain ontology concept extraction method based on association rules and semantic rules. Domain consistency and relevance checks are used to obtain the set of candidate concepts, and association rules are used to obtain the set of candidate relations. The depth and breadth of each candidate concept within the domain term relations are then computed, and this depth and breadth information is fed back into the concept's degree of membership, so that how strongly a term belongs to the domain is analyzed quantitatively. A domain membership check on the candidate concepts completes the extraction. Experimental results show that the method is feasible and reasonable and improves both the efficiency and the accuracy of domain ontology concept extraction, with accuracy increasing by about 12%.
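The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: relations are mined as simple support/confidence rules over terms that co-occur in a sentence, breadth is taken as the number of directly related terms, depth as the longest breadth-first distance reachable along the mined rules, and membership as a weighted mix of the two normalized values. The names `mine_rules`, `membership`, `alpha`, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque
from itertools import permutations


def mine_rules(sentences, min_support=0.3, min_confidence=0.6):
    """Return {(a, b): confidence} for co-occurrence rules a -> b.

    Assumption: each sentence is a list of candidate terms, and a rule
    is kept when its support and confidence reach the thresholds.
    """
    n = len(sentences)
    term_count = defaultdict(int)
    pair_count = defaultdict(int)
    for terms in sentences:
        unique = set(terms)
        for t in unique:
            term_count[t] += 1
        for a, b in permutations(unique, 2):
            pair_count[(a, b)] += 1
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / term_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def membership(rules, alpha=0.5):
    """Score each term in [0, 1] from its depth and breadth in the rule graph.

    Assumption: score = alpha * depth / max_depth
                      + (1 - alpha) * breadth / max_breadth.
    """
    graph = defaultdict(set)
    for a, b in rules:
        graph[a].add(b)
        graph[b]  # touch the key so terms with no outgoing rule are scored too
    if not graph:
        return {}

    def depth_of(start):
        # Longest shortest-path distance reachable from `start` (BFS).
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        return max(seen.values())

    depth = {t: depth_of(t) for t in graph}
    breadth = {t: len(graph[t]) for t in graph}
    max_depth = max(depth.values())
    max_breadth = max(breadth.values())
    return {
        t: alpha * depth[t] / max_depth + (1 - alpha) * breadth[t] / max_breadth
        for t in graph
    }


# Toy input: candidate terms per sentence (hypothetical data).
sentences = [
    ["ontology", "concept", "relation"],
    ["ontology", "concept"],
    ["ontology", "relation"],
    ["concept", "instance"],
    ["weather"],
]
rules = mine_rules(sentences)
scores = membership(rules)
```

In this sketch a term that never appears in a surviving rule, such as `weather`, receives no score at all, which is one simple way to model rejection by a domain membership check. A threshold on `scores` would play the same role for weakly connected terms.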
Source
《吉林大学学报(信息科学版)》
CAS
2014, No. 6, pp. 657-663 (7 pages)
Journal of Jilin University (Information Science Edition)
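Funding
Natural Science Foundation of Jilin Province (20130101060JC)
Twelfth Five-Year Science and Technology Research Fund of the Education Department of Jilin Province (2014131, 2014125)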
Keywords
ontology concept extraction
association rules
semantic rules
domain membership checking