Abstract
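This paper proposes a new logic-based scheme for data integration applications: expressing the mediated schema in a description logic allows virtual data integration based on LAV source descriptions to be combined seamlessly with materialized data warehousing. Within this integration framework, a hybrid reasoning mechanism that combines Datalog predicate-logic inference with automated description-logic reasoning is used to design a query rewriting algorithm, which serves as the basis for the integration system's query processor. The results show that, when both the query and the source view description rules are conjunctive, the algorithm always returns a maximally contained query rewriting, is insensitive to growth in the number of source description rules, has good linear scalability, and can handle the integration of large amounts of data.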
A logic-based scheme for data integration is proposed, and a query answering processor (QAP), a core component of the system, is developed. Data integration is the problem of combining data that reside at different, heterogeneous sources and providing the user with a unified view of those data, called the mediated schema. The system's task is to free the user from having to know where the data are and how they are structured at the sources. In the proposed architecture, the data sources are defined as views over the mediated schema, following the local-as-view (LAV) paradigm. Data storage is managed in a quasi-virtual manner: the data still reside at the sources during query processing, while a data warehouse, treated as a normal data source, can be used seamlessly as an optional enhancement or as a data storage buffer. Moreover, a logic of the description logics (DL) family is used to model the mediated schema, to formulate the queries posed to the system, and to perform several types of automated reasoning that support both modeling and query answering. Using a hybrid reasoning method that combines Datalog inference in first-order predicate logic with the automated reasoning services of description logic, an algorithm used by the QAP to rewrite user queries using views is presented and illustrated. The study shows that, when the query and the views are conjunctive, the algorithm always produces a maximally contained rewriting and scales well in the presence of a large number of views.
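To make the LAV setting concrete, the following is a minimal sketch of the soundness check that any rewriting using views has to pass: the rewriting is expanded by replacing each view atom with the view's definition over the mediated schema, and the expansion must be contained in the original conjunctive query. It is not the algorithm of this paper, and it omits description-logic reasoning and the search for a maximally contained rewriting. The relation names (`teaches`, `enrolled`), the view names, and the helper functions are hypothetical, chosen only for illustration.

```python
from itertools import count

# Assumption: terms are strings; a term starting with an uppercase letter is a variable.
def is_var(term):
    return term[:1].isupper()

def expand(rewriting_body, views):
    """Replace each view atom by the view's body (LAV source description).

    views maps a view name to (head_vars, body); head_vars are assumed to be
    distinct variables. Existential variables are renamed per view atom.
    """
    fresh = count()
    expanded = []
    for pred, *args in rewriting_body:
        head_vars, body = views[pred]
        subst = dict(zip(head_vars, args))
        n = next(fresh)
        for p, *a in body:
            new_args = [subst.get(x, f"{x}_{n}") if is_var(x) else x for x in a]
            expanded.append((p, *new_args))
    return expanded

def contained_in(q1, q2):
    """True if conjunctive query q1 is contained in q2.

    Searches for a containment mapping from q2 into q1; the variables of q1
    are treated as fixed symbols.
    """
    (head1, body1), (head2, body2) = q1, q2

    def match_terms(mapping, terms, targets):
        if len(terms) != len(targets):
            return None
        for t, g in zip(terms, targets):
            if is_var(t):
                if mapping.get(t, g) != g:
                    return None
                mapping = {**mapping, t: g}
            elif t != g:
                return None
        return mapping

    def search(mapping, remaining):
        if not remaining:
            return True
        (pred, *args), rest = remaining[0], remaining[1:]
        for p, *target in body1:
            if p == pred:
                m = match_terms(mapping, args, target)
                if m is not None and search(m, rest):
                    return True
        return False

    start = match_terms({}, head2, head1)
    return start is not None and search(start, list(body2))

# Hypothetical LAV source descriptions over a mediated schema.
views = {
    "v1": (("P", "C"), [("teaches", "P", "C")]),
    "v2": (("S", "C"), [("enrolled", "S", "C")]),
}
# q(S, P) :- enrolled(S, C), teaches(P, C)
query = (("S", "P"), [("enrolled", "S", "C"), ("teaches", "P", "C")])

good = expand([("v2", "S", "C"), ("v1", "P", "C")], views)
bad = expand([("v2", "S", "C1"), ("v1", "P", "C2")], views)

print(contained_in((("S", "P"), good), query))  # True
print(contained_in((("S", "P"), bad), query))   # False
```

The second candidate is rejected because it does not join the two views on the same course, so its expansion can return pairs that the query would not. A maximally contained rewriting, in the sense used above, would be the union of all candidates that pass this kind of check.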
Funding
Supported by the National Natural Science Foundation of China (60401015)