Abstract
In the era of big data, integrating heterogeneous data is an important part of connecting data silos and has remained an active research direction in recent years. As a collection institution, the National Standard Library also faces the problem of multi-source heterogeneous data, mainly in integrating documents that come from different countries and have different data structures. The traditional approach, in which mapping rules are compiled by hand to convert heterogeneous data into homogeneous data, cannot meet the demands of big data integration. This paper describes the concept and key technologies of heterogeneous data integration, together with the system architecture and practical results of applying NoSQL at the National Standard Library.
Source
Standard Science (《标准科学》)
2016, No. 1, pp. 12-15 (4 pages)
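Funding
Supported by the Central Basic Scientific Research Operating Expenses project "Research on a Standards Novelty-Search Tool Based on Text Similarity Computation" (Project No. 252015Y-4003)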
Keywords
heterogeneous data
standard
integration
NoSQL