Research on Big Data Governance Based on Monitoring PaaS (cited by: 2)
Abstract To address the poor data quality of the regional energy internet, a big data governance framework based on a monitoring PaaS is designed. Using a PaaS platform, the DM8MPP database, the Spark computing engine, and related technologies, the framework supports stream-computing modules that traditional ETL tools lack: a message-transmission bridging proxy, sharded in-memory index snapshots, stream-computing data cleaning, and hybrid data storage. This integrates cloud-native data acquisition and processing (SCADA) with data governance techniques so that the two complement each other. The paper uses a dynamic Kafka message proxy to provide cross-layer, wide-area, transparent message publish/subscribe; builds sharded in-memory snapshots on HashMap and smart pointers, extended with index snapshots, for fast information access across nodes and containers; studies stream-computing data cleaning to compute continuous true values from multi-source telemetry and remote-signaling data; and studies a hybrid data storage model to solve the distributed storage and access of the diverse data held on a big data platform. Analysis of simulation test results indicates that the approach improves the ingestion and storage of energy big data and gives the data service bus more convenient data access.
Authors WANG Jun (王军); SONG Yao (宋尧); YU Quanxi (于全喜); NING Nan (宁楠); LIAO Qingyang (廖清阳) (Guian Power Supply Bureau of Guizhou Power Grid Co., Ltd., Guiyang 550025, Guizhou, China; Technology Center of Dongfang Electronics Co., Ltd., Yantai 264000, Shandong, China)
Source Power Systems and Big Data (《电力大数据》), 2020, No. 9, pp. 50-57 (8 pages)
Keywords data governance; cloud native; stream computing; elastic message queue; real-time database; hybrid storage