34 articles found
ETL Maturity Model for Data Warehouse Systems:A CMMI Compliant Framework
1
Authors: Musawwer Khan, Islam Ali, Shahzada Khurram, Salman Naseer, Shafiq Ahmad, Ahmed T. Soliman, Akber Abid Gardezi, Muhammad Shafiq, Jin-Ghoo Choi. Source: Computers, Materials & Continua (SCIE, EI), 2023, No. 2, pp. 3849-3863 (15 pages).
The effectiveness of a Business Intelligence (BI) system depends mainly on the quality of the knowledge it produces. The decision-making process is hindered, and the user's trust is lost, if the knowledge offered is undesired or of poor quality. A Data Warehouse (DW) is a huge collection of data gathered from many sources and an important part of any BI solution to assist management in making better decisions. The Extract, Transform, and Load (ETL) process is the backbone of a DW system, and it is responsible for moving data from source systems into the DW system. The more mature the ETL process, the more reliable the DW system. In this paper, we propose the ETL Maturity Model (EMM) that assists organizations in achieving a high-quality ETL system and thereby enhancing the quality of knowledge produced. The EMM is made up of five levels of maturity, i.e., Chaotic, Acceptable, Stable, Efficient and Reliable. Each level of maturity contains Key Process Areas (KPAs) that have been endorsed by industry experts and include all critical features of a good ETL system. Quality Objectives (QOs) are defined procedures that, when implemented, result in a high-quality ETL process. Each KPA has its own set of QOs, the execution of which meets the requirements of that KPA. Multiple brainstorming sessions with relevant industry experts helped to enhance the model. EMM was deployed in two key projects utilizing multiple case studies to supplement the validation process and support our claim. This model can assist organizations in improving their current ETL process and transforming it into a more mature ETL system. This model can also provide high-quality information to assist users in making better decisions and gaining their trust.
Keywords: ETL maturity model CMMI data warehouse maturity model
Decision-Making Information System for Academic Careers in Congolese Universities: From Analysis to Design of a Data Warehouse
2
Authors: Boribo Kikunda Philippe, Thierry Nsabimana, Longin Ndayisaba, Jules Raymond Kala, Jérémie Ndikumagenge, Elie Zihindula Mushengezi. Source: Open Journal of Applied Sciences, 2023, No. 12, pp. 2395-2407 (13 pages).
Universities collect and generate a considerable amount of data on students throughout their academic career. Currently in South Kivu, most universities have an information system in the form of a database made up of several disparate files. This makes it difficult to use this data efficiently and profitably. The aim of this study is to develop this transactional database-based information system into a data warehouse-oriented system. This tool will be able to collect, organize and archive data on the student's career path, year after year, and transform it for analysis purposes. In the age of Big Data, a number of artificial intelligence techniques have been developed, making it possible to extract useful information from large databases. This extracted information is of paramount importance in decision-making. By way of example, the information extracted by these techniques can be used to predict which stream a student should choose when applying to university. In order to develop our contribution, we analyzed the IT information systems used in the various universities and applied the bottom-up method to design our data warehouse model. We used the relational model to design the data warehouse.
Keywords: data warehouse University Courses Universities of South Kivu
Development of Geological Data Warehouse (cited by: 2)
3
Authors: Li Zhenhua, Hu Guangdao, Zhang Zhenfei. Source: Journal of China University of Geosciences (SCIE, CSCD), 2003, No. 3, pp. 261-264 (4 pages).
Data warehouse (DW), a new technology invented in the 1990s, is more useful for integrating and analyzing massive data than a traditional database. Its application in the geology field can be divided into 3 phases: 1992-1996, when the commercial data warehouse (CDW) appeared; 1996-1999, when the geological data warehouse (GDW) appeared and geologists and geographers realized the importance of DW and began to study it, although practical DWs still followed the framework of the DB; and 2000 to the present, in which the geological data warehouse has grown and the theory of the geo-spatial data warehouse (GSDW) has been developed, but research in the geological area is still deficient, except in geography. Although some developments of GDW have been made, its core still follows the CDW approach of organizing data by time, which brings about 3 problems: it is difficult to integrate geological data, because the data feature space more than time; it is hard to store massive data at different levels, for the same reason; and spatial analysis is hardly supported if the data are organized by time as in a CDW. So the GDW should be redesigned by organizing data by scale, in order to store mass data at different levels and synthesize the data at different granularities, and by choosing space control points to replace the former time control points, so as to integrate different types of data by storing each type of data as one layer and then superposing the layers. In addition, the data cube, a widely used technology in CDW, will be of no use in GDW, because the causality among geological data is not as obvious as in commercial data, since the data are the mixed result of many complex rules and their analysis always needs special geological methods and software; moreover, a data cube for massive and complex geo-data would consume too much storage space to be practical. On this point, the main purpose of GDW may be data integration, unlike CDW, whose main purpose is data analysis.
Keywords: data warehouse (DW) geological data warehouse (GDW) space control points data cube
Data Warehouse Design for Big Data in Academia
4
Authors: Alex Rudniy. Source: Computers, Materials & Continua (SCIE, EI), 2022, No. 4, pp. 979-992 (14 pages).
This paper describes the process of design and construction of a data warehouse ("DW") for an online learning platform using three prominent technologies: Microsoft SQL Server, MongoDB and Apache Hive. The three systems are evaluated for corpus construction and descriptive analytics. The case also demonstrates the value of evidence-centered design principles for data warehouse design that is sustainable enough to adapt to the demands of handling big data in a variety of contexts. Additionally, the paper addresses the maintainability-performance tradeoff, storage considerations and accessibility of big data corpora. In this NSF-sponsored work, the data were processed, transformed, and stored in the three versions of a data warehouse in search of a better performing and more suitable platform. The data warehouse engines (a relational database, a NoSQL database, and a big data technology for parallel computations) were subjected to principled analysis. Design, construction and evaluation of a data warehouse were scrutinized to find improved ways of storing, organizing and extracting information. The work also examines building corpora, performing ad-hoc extractions, and ensuring confidentiality. It was found that Apache Hive demonstrated the best processing time, followed by SQL Server and MongoDB. In the aspect of analytical queries, SQL Server was the top performer, followed by MongoDB and Hive. This paper also discusses a novel process for rendering students anonymous in compliance with Family Educational Rights and Privacy Act regulations. Five phases for DW design are recommended: 1) establishing goals at the outset based on Evidence-Centered Design principles; 2) recognizing the unique demands of student data and use; 3) adopting a model that integrates cost with technical considerations; 4) designing a comparative database; and 5) planning for a DW design that is sustainable. Recommendations for future research include attempting DW design in contexts involving larger data sets and more refined operations, and ensuring attention is paid to sustainability of operations.
Keywords: Big data data warehouse MONGODB Apache hive SQL server
Optimal Genetic View Selection Algorithm for Data Warehouse
5
Authors: 王自强, 冯博琴. Source: Journal of Southwest Jiaotong University (English Edition), 2005, No. 1, pp. 5-10 (6 pages).
To efficiently solve the materialized view selection problem, an optimal genetic algorithm for selecting a set of views to be materialized is proposed, so as to achieve both good query performance and low view maintenance cost under a storage space constraint. First, a pre-processing algorithm based on the maximum benefit per unit space is used to generate initial solutions. Then, the initial solutions are improved by a genetic algorithm that mixes several optimal strategies. Furthermore, infeasible solutions generated during the evolution process are repaired by a loss function. The experimental results show that the proposed algorithm outperforms the heuristic algorithm and the canonical genetic algorithm in finding optimal solutions.
Keywords: data warehouse Genetic algorithm View selection AND-OR graph
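The pre-processing step named in the abstract above (initial solutions chosen by maximum benefit per unit space under a storage constraint) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `View` record, the benefit and size figures, and the function name `greedy_initial_solution` are all hypothetical, and the genetic improvement and repair steps are left out.

```python
from typing import NamedTuple


class View(NamedTuple):
    name: str
    benefit: float  # hypothetical query-cost saving if the view is materialized
    size: float     # hypothetical storage the materialized view would occupy


def greedy_initial_solution(views: list[View], capacity: float) -> list[str]:
    """Pick views by descending benefit per unit space while they still fit."""
    chosen: list[str] = []
    used = 0.0
    for view in sorted(views, key=lambda v: v.benefit / v.size, reverse=True):
        if used + view.size <= capacity:
            chosen.append(view.name)
            used += view.size
    return chosen


# Hypothetical candidate views (benefit/size ratios: 3.0, 4.0, 1.0).
candidates = [
    View("sales_by_day", benefit=90.0, size=30.0),
    View("sales_by_region", benefit=40.0, size=10.0),
    View("sales_by_product", benefit=50.0, size=50.0),
]
print(greedy_initial_solution(candidates, capacity=45.0))
```

With a capacity of 45, the two views with the highest ratios fit and the third is skipped; a genetic algorithm could then start from such a feasible selection.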
Research on the Construction of a Data Warehouse Model for College Student Performance
6
Authors: Juntao Chen, Jinmei Zhan, Fei Tian. Source: 国际计算机前沿大会会议论文集 (EI), 2023, No. 2, pp. 408-419 (12 pages).
Students' grades not only serve as an effective indicator of their learning achievements but also, to some extent, reflect the completion of teaching tasks by the instructors. Currently, many universities across the country have collected and recorded various information about students and teachers in the school's information management system, but this is only simple record storage: hidden information has not been effectively mined, and the data have not been fully utilized. Student performance information, enrolment information, course information, teaching plans, and teacher-related information are currently stored in separate databases that are independent of each other, making it difficult to perform effective data analysis. Data warehousing technology can integrate this information and, with data analysis software, mine more high-value information, which is convenient for teaching evaluation and for optimizing teaching strategies. Based on data warehousing technology, the article uses the hierarchical concept of data warehousing to construct the ODS layer, DWD layer, DWS layer and ETL layer. Oriented to the data warehouse subject, the article designs the conceptual model, logical model, and physical model of the data warehouse based on student performance, providing a model basis for later data mining.
Keywords: Student Performance data warehouse Model Construction
A study on building data warehouse of hospital information system (cited by: 10)
7
Authors: LI Ping, WU Tao, CHEN Mu, ZHOU Bin, XU Wei-guo. Source: Chinese Medical Journal (SCIE, CAS, CSCD), 2011, No. 15, pp. 2372-2377 (6 pages).
Background: Existing hospital information systems with simple statistical functions cannot meet current management needs. It is well known that hospital resources are distributed with private property rights among hospitals, such as in the case of the regional coordination of medical services. In this study, to integrate and make full use of medical data effectively, we propose a data warehouse modeling method for the hospital information system. The method can also be employed for a distributed-hospital medical service system. Methods: To ensure that hospital information supports the diverse needs of health care, the framework of the hospital information system has three layers: datacenter layer, system-function layer, and user-interface layer. This paper discusses the role of a data warehouse management system in handling hospital information from the establishment of the data theme to the design of a data model to the establishment of a data warehouse. Online analytical processing tools assist user-friendly multidimensional analysis from a number of different angles to extract the required data and information. Results: Use of the data warehouse improves online analytical processing and mitigates deficiencies in the decision support system. The hospital information system based on a data warehouse effectively employs statistical analysis and data mining technology to handle massive quantities of historical data, and summarizes from clinical and hospital information for decision making. Conclusions: This paper proposes the use of a data warehouse for a hospital information system, specifically a data warehouse for the theme of hospital information to determine latitude, modeling and so on. The processing of patient information is given as an example that demonstrates the usefulness of this method in the case of hospital information management. Data warehouse technology is an evolving technology, and more and more decision support information extracted by data mining and with decision-making technology is required for further research.
Keywords: hospital information management hospital information system data warehouse online analytical processing
Efficient query processing framework for big data warehouse: an almost join-free approach (cited by: 3)
8
Authors: Huiju WANG, Xiongpai QIN, Xuan ZHOU, Furong LI, Zuoyan QIN, Qing ZHU, Shan WANG. Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2015, No. 2, pp. 224-236 (13 pages).
The rapidly increasing scale of data warehouses is challenging today's data analytical technologies. A conventional data analytical platform processes data warehouse queries using a star schema: it normalizes the data into a fact table and a number of dimension tables, and during query processing it selectively joins the tables according to users' demands. This model is space economical. However, it faces two problems when applied to big data. First, join is an expensive operation, which prohibits a parallel database or a MapReduce-based system from achieving efficiency and scalability simultaneously. Second, join operations have to be executed repeatedly, while numerous join results can actually be reused by different queries. In this paper, we propose a new query processing framework for data warehouses. It pushes the join operations partially to the pre-processing phase and partially to the post-processing phase, so that data warehouse queries can be transformed into massively parallelized filter-aggregation operations on the fact table. In contrast to the conventional query processing models, our approach is efficient, scalable and stable despite the large number of tables involved in the join. It is especially suitable for a large-scale parallel data warehouse. Our empirical evaluation on Hadoop shows that our framework exhibits linear scalability and outperforms some existing approaches by an order of magnitude.
Keywords: data warehouse large scale TAMP join-free multi-version schema
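The general idea in the abstract above, resolving joins ahead of time so that a query becomes a filter-and-aggregate scan of the fact table, can be sketched as follows. This is a deliberately simplified illustration, not the paper's framework: it performs the whole join in pre-processing (the paper splits it between pre- and post-processing), and the tables, column names and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical star schema: one fact table plus one dimension table.
dim_store = {1: {"region": "north"}, 2: {"region": "south"}}
fact_sales = [
    {"store_id": 1, "amount": 10.0},
    {"store_id": 2, "amount": 5.0},
    {"store_id": 1, "amount": 7.5},
]


def denormalize(facts, dim):
    """Pre-processing: resolve the join once and widen each fact row."""
    return [{**row, **dim[row["store_id"]]} for row in facts]


def filter_aggregate(rows, predicate, group_key, measure):
    """Query time: a single scan with a filter and a grouped sum, no join."""
    totals = defaultdict(float)
    for row in rows:
        if predicate(row):
            totals[row[group_key]] += row[measure]
    return dict(totals)


wide_facts = denormalize(fact_sales, dim_store)
result = filter_aggregate(wide_facts, lambda r: r["amount"] > 6.0, "region", "amount")
print(result)
```

Because each widened row is independent, the scan in `filter_aggregate` could be split across workers and the partial sums merged, which is what makes this shape of query easy to parallelize.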
Modelling and implementing big data warehouses for decision support (cited by: 2)
9
Authors: Maribel Yasmina Santos, Bruno Martinho, Carlos Costa. Source: Journal of Management Analytics (EI), 2017, No. 2, pp. 111-129 (19 pages).
In the era of Big Data, many NoSQL databases emerged for the storage and later processing of vast volumes of data, using data structures that can follow columnar, key-value, document or graph formats. For analytical contexts requiring a Big Data Warehouse, Hive is used as the driving force, allowing the analysis of vast amounts of data. Data models in Hive are usually defined taking into consideration the queries that need to be answered. In this work, a set of rules is presented for the transformation of multidimensional data models into Hive tables, making data available at different levels of detail. These levels are suited for answering different queries, depending on the analytical needs. After the identification of the Hive tables, this paper summarizes a demonstration case in which the implementation of a specific Big Data architecture shows how the evolution from a traditional Data Warehouse to a Big Data Warehouse is possible.
Keywords: big data data model data warehouse hive NOSQL
Efficient Aggregation Algorithms on Very Large Compressed Data Warehouses (cited by: 1)
10
Authors: 李建中, 李英姝, Jaideep Srivastava. Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2000, No. 3, pp. 213-229 (17 pages).
Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational database based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large and compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The algorithms are described and analyzed with respect to the I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.
Keywords: OLAP AGGREGATION data warehouse
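To make concrete what aggregating directly on compressed data can mean, here is a minimal sketch that sums a run-length-encoded array per row without expanding it cell by cell. Run-length encoding is an assumption chosen for illustration (the abstract does not name a compression method), and the data layout and the function `sum_by_row` are hypothetical rather than taken from the paper.

```python
from collections import defaultdict

# Hypothetical run-length-encoded measure: (value, run_length) pairs stored in
# row-major order of a 2 x 4 array whose rows are the values of one dimension.
runs = [(5, 3), (0, 2), (2, 3)]
ROW_WIDTH = 4


def sum_by_row(runs, row_width):
    """Aggregate per row by walking runs, never expanding them cell by cell."""
    totals = defaultdict(int)
    position = 0
    for value, length in runs:
        remaining = length
        while remaining > 0:
            row = position // row_width
            # Number of cells of this run that still fit in the current row.
            take = min(remaining, row_width - position % row_width)
            totals[row] += value * take
            position += take
            remaining -= take
    return dict(totals)


print(sum_by_row(runs, ROW_WIDTH))
```

The work is proportional to the number of runs plus the number of row boundaries crossed, not to the number of cells, which is the kind of saving that motivates operating on the compressed form.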
An Application of Rough Set Theory to Modelling and Utilising Data Warehouses
11
Authors: DENG Ming-rong (1), YANG Jian-bo (2), PAN Yun-he (3). Affiliations: 1. School of Management, Zhejiang University, Hangzhou 310028, China; 2. Manchester School of Management, Institute of Science and Technology, University of Manchester, M60 1QD, UK; 3. Zhejiang University, Hangzho. Source: Journal of Systems Science and Systems Engineering (SCIE, EI, CSCD), 2001, No. 4, pp. 489-496 (8 pages).
A data warehouse often accommodates enormous summary information in various granularities and is mainly used to support on-line analytical processing. Ideally all detailed data should be accessible by residing in some legacy systems or on-line transaction processing systems. In many cases, however, data sources in computers are also kinds of summary data due to technological problems or budget limits and also because different aggregation hierarchies may need to be used among various transaction systems. In such circumstances, it is necessary to investigate how to design dimensions, which play a major role in the dimensional model for a data warehouse, and how to estimate summary information that is not stored in the data warehouse. In this paper, rough set theory is applied to support the dimension design and information estimation.
Keywords: rough sets data warehouse DIMENSION
Performance optimization of grid aggregation in spatial data warehouses
12
Authors: Myoung-Ah Kang, Mehdi Zaamoune, François Pinet, Sandro Bimonte, Philippe Beaune. Source: International Journal of Digital Earth (SCIE, EI, CSCD), 2015, No. 12, pp. 970-988 (19 pages).
The storage and querying of large volumes of spatial grids is an open problem. In this paper, we propose a method to optimize queries that aggregate raster grids stored in databases. In our approach, we propose to estimate the result rather than calculate it exactly. This approach reduces query execution time. One advantage of our method is that it does not require implementing or modifying functionalities of database management systems. Our approach is based on a new data structure and a specific model of SQL queries. Our work is applied here to relational data warehouses.
Keywords: data warehouse database modelling geographical information system
Hybrid Warehouse Model and Solutions for Climate Data Analysis
13
Authors: Hasan Hashim. Source: Journal of Computer and Communications, 2020, No. 10, pp. 75-98 (24 pages).
Recently, due to the rapid growth in the number of data sensors, a massive volume of data is generated from different sources. Administering such data, in the sense of storing, managing, analyzing, and extracting insightful information from it, is a challenging task. Big data analytics is becoming a vital research area in domains such as climate data analysis, which demands fast access to data. Nowadays, the open-source platform MapReduce, a distributed computing framework, is widely used in many domains of big data analysis. In our work, we have developed a conceptual framework of data modeling that is useful for implementing a hybrid data warehouse model to store the features of National Climatic Data Center (NCDC) climate data. The hybrid data warehouse model for climate big data enables the identification of weather patterns that would be applicable in agricultural and other climate change-related studies, which will play a major role in recommending actions to be taken by domain experts and in making contingency plans for extreme cases of weather variability.
Keywords: data warehouse HADOOP NCDC data Set WEATHER
Hierarchical Datacubes
14
Authors: Mickaël Martin Nevot, Sébastien Nedjar, Lotfi Lakhal. Source: Journal of Computer and Communications, 2023, No. 6, pp. 43-72 (30 pages).
Many approaches have been proposed to pre-compute data cubes in order to efficiently respond to OLAP queries in data warehouses. However, few have proposed solutions integrating all of the possible outcomes, and it is this idea that leads to the integration of hierarchical dimensions into these responses. To meet this need, we propose, in this paper, a complete redefinition of the framework and the formal definition of traditional database analysis through the prism of hierarchical dimensions. After characterizing the hierarchical data cube lattice, we introduce the hierarchical data cube and its most concise reduced representation, the closed hierarchical data cube. It offers compact replication so as to optimize storage space by removing redundancies of strongly correlated data. Such data are typical of data warehouses, and in particular in video games, our field of study and experimentation, where hierarchical dimension attributes are widely represented.
Keywords: ROLAP Cubing data warehouse datacube Big data Business Intelligence Hierarchical Cube Hierarchical Dimensions
Constructing a raster-based spatio-temporal hierarchical data model for marine fisheries application (cited by: 2)
15
Authors: SU Fenzhen, ZHOU Chenhu, ZHANG Tianyu. Source: Acta Oceanologica Sinica (SCIE, CAS, CSCD), 2006, No. 1, pp. 57-63 (7 pages).
Marine information has been increasing quickly. Traditional database technologies have disadvantages in manipulating large amounts of marine information that relates to position in 3-D and to time. Recently, greater emphasis has been placed on GIS (geographical information system) to deal with marine information. GIS has shown great success in terrestrial applications in recent decades, but its use in marine fields has been far more restricted. One of the main reasons is that most GIS systems and their data models are designed for land applications. They cannot cope well with the nature of the marine environment and with marine information, and this becomes a fundamental challenge to traditional GIS and its data structure. This work designed a data model, the raster-based spatio-temporal hierarchical data model (RSHDM), for the marine information system and for knowledge discovery from spatio-temporal data; it is based on the nature of marine data and overcomes the shortcomings of current spatio-temporal models when they are used in this field. As an experiment, a marine fishery data warehouse (FDW) for marine fishery management was set up based on the RSHDM. The experiment proved that the RSHDM handles the data well and can easily extract the aggregations that management needs at different levels.
Keywords: marine geographical information system spatio-temporal data model knowledge discovery fishery management data warehouse
Multi-Dimensional Customer Data Analysis in Online Auctions
16
Authors: LAO Guoling, XIONG Kuan, QIN Zheng. Source: Wuhan University Journal of Natural Sciences (CAS), 2007, No. 5, pp. 793-798 (6 pages).
In this paper, we designed a customer-centered data warehouse system with five subjects (listing, bidding, transaction, accounts, and customer contact) based on the business process of online auction companies. For each subject, we analyzed its fact indexes and dimensions. Then, taking the transaction subject as an example, we analyzed the data warehouse model in detail and obtained the multi-dimensional analysis structure of the transaction subject. Finally, using data mining for customer segmentation, we divided customers into four types: impulse customers, prudent customers, potential customers, and ordinary customers. With the results of multi-dimensional customer data analysis, online auction companies can do more targeted marketing and increase customer loyalty.
Keywords: online auction data warehouse online analytic process (OLAP) data mining E-COMMERCE
Constructing a data platform for surface defect management using a multidimensional database
17
Authors: SU Yicai, OU Peng, GAO Wenwu. Source: Baosteel Technical Research (CAS), 2013, No. 4, pp. 16-20 (5 pages).
Surface quality has been one of the key factors influencing the ongoing improvement of the quality of steel. Therefore, it is urgent to provide methods for efficient supervision of surface defects. This paper first describes the main problems existing in defect management and then focuses on constructing a data platform for surface defect management using a multidimensional database. Finally, some online applications of the platform at Baosteel are demonstrated. Results show that the constructed multidimensional database provides more structured defect data, and thus it is suitable for swift and multi-angle analysis of the defect data.
Keywords: surface defect multidimensional database data warehouse on-line analysis
Metallic Mineral Resources Assessment and Analysis System Design (cited by: 3)
18
Authors: Hu Guangdao, Chen Jianguo, Chen Shouyu. Affiliation: Institute of Mathematical Geology and Remote Sensing, China University of Geosciences, Wuhan 430074, China. Source: Journal of Earth Science (SCIE, CAS, CSCD), 2000, No. 3, pp. 114-117 (4 pages).
This paper presents the aim and the design structure of the metallic mineral resources assessment and analysis system. This system adopts an integrated data warehouse technique composed of an affairs processing layer and an analysis application layer. The affairs processing layer includes multiform databases (such as a geological database, a geophysical database and a geochemical database), while the analysis application layer includes the data warehouse, online analysis processing and data mining. This paper also presents in detail the data warehouse of the present system and the appropriate spatial analysis methods and models. Finally, this paper presents the prospects of the system.
Keywords: mineral resources assessment data warehouse spatial analysis.
Applying XML for designing and interchanging information for multidimensional model (cited by: 2)
19
Authors: Lu Changhui, Deng Su, Zhang Weiming. Source: Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2005, No. 4, pp. 823-830 (8 pages).
In order to exchange and share information among the conceptual models of data warehouses, and to build a solid base for the integration and sharing of metadata, a new multidimensional conceptual model based on XML is presented and its DTD is defined, which can describe the various semantic characteristics of a multidimensional conceptual model. According to the UML-based multidimensional conceptual modeling technique, the mapping algorithm between the XML-based multidimensional conceptual model and the UML class diagram is described, and an application base for the wide use of this technique is given.
Keywords: data warehouse OLAP multidimensional modeling conceptual model XML UML.
Study and Implementation of a New SQL-Based ETL Approach (cited by: 2)
20
Authors: BAO Yubin, SONG Jie, LENG Fangling, WANG Daling, YU Ge. Source: Wuhan University Journal of Natural Sciences (CAS), 2007, No. 5, pp. 804-808 (5 pages).
This paper analyzes the main characteristics, benefits, and disadvantages of existing traditional ETL (extraction, transformation, loading) methods, and summarizes some factors affecting the performance of ETL tools. Then, a new ETL approach, E-LT (extraction, loading and transformation), is proposed. The E-LT approach applies a database mapping technique so that the loading and transformation stages of the ETL process are performed at the same time, after the extraction stage. Thus, it can use SQL commands to complete loading and transformation processing, and it eliminates the staging area that precedes loading in the traditional ETL process. The framework of an ETL engine based on the E-LT method is presented. The ETL process, including initial loading and incremental refreshment, is discussed in detail, and the SQL-based algorithm for initial loading is presented. Experiments and theoretical analysis show that the E-LT method outperforms some commercial ETL approaches in loading throughput. Finally, a real case applying the E-LT method in marine data warehousing is discussed to illustrate the validity of the proposed method.
Keywords: data warehouse ETL E-LT SQL
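The core idea in the abstract above, doing loading and transformation together with SQL inside the database instead of in a separate staging step, might look roughly like the sketch below. It uses Python's standard `sqlite3` module only as a convenient stand-in; the table names, columns and transformation rules are hypothetical, and `src_orders` merely plays the role of extracted source data that has been mapped into the target database. It is not the engine or algorithm described in the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_orders (id INTEGER, amount_cents INTEGER, country TEXT);
    CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, amount REAL, country TEXT);
    INSERT INTO src_orders VALUES (1, 1250, ' us'), (2, 300, 'DE '), (3, NULL, 'fr');
    """
)

# Loading and transformation happen in one SQL statement inside the database:
# unit conversion, trimming, upper-casing and filtering, with no staging table.
conn.execute(
    """
    INSERT INTO dw_orders (id, amount, country)
    SELECT id, amount_cents / 100.0, UPPER(TRIM(country))
    FROM src_orders
    WHERE amount_cents IS NOT NULL
    """
)
conn.commit()

rows = conn.execute("SELECT id, amount, country FROM dw_orders ORDER BY id").fetchall()
print(rows)
```

An incremental refresh could follow the same pattern by restricting the `SELECT` to rows newer than the last load, though the abstract does not say how the paper implements that step.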