Abstract
To address the semantic complexity of constraint-rule descriptions in data quality and the need to extract such rules quickly, ontology technology is introduced to describe the semantic relations among data quality constraint rules and related elements. The idea of a meta-ontology is proposed: the core vocabulary of the data quality domain is distilled and, following relevant standards, a domain-independent data quality meta-ontology model is constructed. In application, a specific domain can instantiate this meta-ontology model, as its needs require, into a data quality ontology that describes that domain. This solves the problem of sharing and explicitly describing data quality vocabulary, and it makes the semantics of complex data quality constraint rules describable. Taking petroleum-domain data as an example, a petroleum-domain quality ontology model is instantiated from the proposed quality ontology meta-model, various reasoning rules are defined, and the soundness of the constructed data quality ontology is verified with the Jena reasoner, greatly improving the efficiency of constraint rule extraction in data quality assessment.
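The instantiation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model and it does not use Jena or OWL. It only mirrors the idea in plain Python: domain-independent meta-level concepts are filled in for one domain, and the rules attached to an attribute are then looked up. Every class name, attribute name, and threshold below (`QualityDimension`, `ConstraintRule`, `DomainQualityOntology`, `well_id`, `depth_m`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Meta-level concepts (hypothetical names, not taken from the paper).
@dataclass(frozen=True)
class QualityDimension:
    name: str  # e.g. "completeness", "validity"


@dataclass
class ConstraintRule:
    name: str
    dimension: QualityDimension
    target_attribute: str
    check: Callable[[dict], bool]  # returns True when the record satisfies the rule


@dataclass
class DomainQualityOntology:
    domain: str
    rules: List[ConstraintRule] = field(default_factory=list)

    def rules_for(self, attribute: str) -> List[ConstraintRule]:
        """Extract every constraint rule attached to one attribute."""
        return [r for r in self.rules if r.target_attribute == attribute]

    def violations(self, record: dict) -> List[str]:
        """Names of the rules that the given record fails."""
        return [r.name for r in self.rules if not r.check(record)]


# Instantiation for a petroleum-domain example (attributes and limits are invented).
completeness = QualityDimension("completeness")
validity = QualityDimension("validity")

petroleum = DomainQualityOntology(
    domain="petroleum",
    rules=[
        ConstraintRule("well_id_present", completeness, "well_id",
                       lambda rec: bool(rec.get("well_id"))),
        ConstraintRule("depth_non_negative", validity, "depth_m",
                       lambda rec: rec.get("depth_m", 0) >= 0),
    ],
)

record = {"well_id": "", "depth_m": -12.5}
print([r.name for r in petroleum.rules_for("depth_m")])
print(petroleum.violations(record))
```

In an ontology-based implementation such as the one the abstract describes, the rule lookup would presumably be a query or inference over the ontology rather than a list filter. The sketch only shows how the meta-level and the domain-level are separated.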
Source
《吉林大学学报(信息科学版)》
CAS
2017, Issue 6, pp. 670-677 (8 pages)
Journal of Jilin University (Information Science Edition)
Funding
Supported by the Northeast Petroleum University National Fund Cultivation Fund project (2017PYYL-06)