Abstract
The chemistry and chemical engineering databases of the Chinese Academy of Sciences are not integrated, so users must jump between specialized databases to gather detailed information about a single compound. This paper designs the Chemistry Subject Database as a unified platform that integrates them. Three mainstream data integration methods were compared (among them data warehousing and the federated database architecture), and the federated database architecture was selected: it preserves the independence of each member sub-database, so every sub-database can be maintained and updated on its own, which suits sub-databases whose data types differ widely, that are distributed and heterogeneous, and that are hard to update in a unified way. To fit these characteristics, the traditional federated database architecture is extended with a data integration model that organizes the data resources into an interlinked data integration platform keyed on a unique compound identifier. Two ways of building the model were compared, one rooted at the subject classification and one rooted at the compound; the compound-rooted concept tree markedly simplifies users' search steps and is better suited to integrating and presenting chemistry data. For the user interface, the paper concentrates on a unified search entry and a visual results display: the former removes the need to jump between specialized databases, and the latter presents results from different data sources level by level and node by node according to the preset data model.
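The architecture described in the abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the paper's implementation: the class names `ConceptNode` and `FederatedSearch`, the category labels, the identifier `CMPD-0001`, and the in-memory sub-databases are all hypothetical. Each member sub-database stays independent behind its own query function, a single search entry fans the query out by compound identifier, and the results are assembled into a concept tree rooted at the compound.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ConceptNode:
    """One node of the concept tree; the root stands for a compound."""
    label: str
    records: List[dict] = field(default_factory=list)
    children: List["ConceptNode"] = field(default_factory=list)


# Hypothetical adapter type: given a compound identifier, a member
# sub-database returns its own matching records.
SubDatabase = Callable[[str], List[dict]]


class FederatedSearch:
    """Unified search entry over independently maintained sub-databases."""

    def __init__(self) -> None:
        self._members: Dict[str, SubDatabase] = {}

    def register(self, category: str, query: SubDatabase) -> None:
        # Each sub-database keeps its own storage and update cycle;
        # only its query function is registered here.
        self._members[category] = query

    def search(self, compound_id: str) -> ConceptNode:
        # The compound is the root; every sub-database that knows the
        # compound contributes one child node.
        root = ConceptNode(label=compound_id)
        for category, query in self._members.items():
            records = query(compound_id)
            if records:
                root.children.append(ConceptNode(category, records))
        return root


def render(node: ConceptNode, depth: int = 0) -> List[str]:
    """Flatten the tree into indented lines, level by level."""
    lines = ["  " * depth + node.label]
    lines += ["  " * (depth + 1) + str(r) for r in node.records]
    for child in node.children:
        lines += render(child, depth + 1)
    return lines


# Toy in-memory sub-databases with placeholder records (not real data).
spectra = {"CMPD-0001": [{"kind": "IR", "file": "placeholder-1"}]}
properties = {"CMPD-0001": [{"name": "density", "value": "placeholder"}]}
reactions: Dict[str, List[dict]] = {}

engine = FederatedSearch()
engine.register("spectra", lambda cid: spectra.get(cid, []))
engine.register("properties", lambda cid: properties.get(cid, []))
engine.register("reactions", lambda cid: reactions.get(cid, []))

tree = engine.search("CMPD-0001")
```

Calling `render(tree)` yields indented lines that could be joined for display. With the compound as root, one lookup gathers every data type, whereas a subject-rooted tree would make the user pick a subject branch before reaching the compound.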
Source
《计算机与应用化学》
CAS
CSCD
PKU Core (Peking University core journal list)
2010, Issue 12, pp. 1655-1659 (5 pages)
Computers and Applied Chemistry
Funding
Chinese Academy of Sciences Informatization Special Project (INF-115-C01-SDB3-03)
Keywords
federated database architecture
data integration
concept tree
chemistry subject database