Abstract
Integrating heterogeneous data sources is a precondition for sharing data across an enterprise. Highly efficient data updating both reduces system overhead and provides real-time data, and modifying data rapidly is a hot issue in the pre-processing area of the data warehouse. An extract-transform-load (ETL) design is proposed based on a new algorithm called Diff-Match, which is developed using pattern matching and data-filtering techniques. It accelerates data updating, filters heterogeneous data, and identifies the differing sets of data. Its efficiency has been demonstrated by its successful application in an electrical apparatus enterprise group.
Funding
Supported by the National Natural Science Foundation of China (No. 50475117) and the Tianjin Natural Science Foundation (No. 06YFJMJC03700).