Research on a Reusable Integration Model of Big Data (一种可复用的大数据集成模式)

Abstract: Data integration is an important step in big data and data warehouse work, and it directly determines whether data can be used smoothly, conveniently, and accurately. In an enterprise-level big data platform or data warehouse, data integration is an ongoing effort: new data sources are added frequently and must be integrated into the existing warehouse tables. This step is very time-consuming and labor-intensive, and an inappropriate integration scheme leads to even more costly rework, such as bulk reprocessing of existing data and extensive modification of existing calling logic, that is, data cutover and application cutover. This paper studies a relatively systematic and general model that avoids these situations as far as possible, so that data integration becomes simple, reusable, and sustainable, and so that data use is stable and accurate.

Authors: Jing Chao; Tian Lingtao (ENC Digital Technology Co., Ltd., Nanjing 210005, China; Nanjing Jing Wei Patent & Trade Mark Agency Co., Ltd., Nanjing 210005, China)

Affiliations: ENC Digital Technology Co., Ltd.; Nanjing Jing Wei Patent & Trade Mark Agency Co., Ltd.

Source: Jiangsu Science and Technology Information (《江苏科技信息》), 2019, No. 27, pp. 56-58 (3 pages)

Keywords: data integration; big data; data warehouse; data processing; data fusion

Classification number: TP3-0 [Automation and Computer Technology: Computer Science and Technology]
