Abstract
This paper introduces DataTurbo, a pluggable data exchange and integration tool developed by the authors and built on a unified data model and an extended data flow model. Its example-driven user interface guides users in quickly and flexibly composing configurable functional plug-ins into data flows, achieving automatic, robust, and efficient materialized data integration. The unified data model reduces the complexity that differences in data storage formats and semantics introduced when using earlier ETL tools, and it also improves the extensibility of the plug-ins and of the tool itself. The extended data flow model supports the definition of flow transactions and asynchronous event handling based on shared state. The former adds easily understood control information to a flow through model transformation; the latter lets the system respond quickly to exceptional events. DataTurbo has been successfully deployed and is in service at the data centers of Panyu District and Nansha District in Guangzhou.
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal
2009, No. 10, pp. 3770-3773, 3777 (5 pages)
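Funding
Guangdong-Hong Kong Key Areas Major Breakthrough Project (2008A011400010)
National Technology Innovation Fund Project (08C26214411198)
Guangzhou Innovation Fund Project (2007V41C0301)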