2,104 articles found
ETL Maturity Model for Data Warehouse Systems: A CMMI Compliant Framework
1
Authors: Musawwer Khan, Islam Ali, Shahzada Khurram, Salman Naseer, Shafiq Ahmad, Ahmed T. Soliman, Akber Abid Gardezi, Muhammad Shafiq, Jin-Ghoo Choi 《Computers, Materials & Continua》 (SCIE, EI), 2023, No. 2, pp. 3849-3863 (15 pages)
The effectiveness of the Business Intelligence (BI) system mainly depends on the quality of knowledge it produces. The decision-making process is hindered, and the user’s trust is lost, if the knowledge offered is undesired or of poor quality. A Data Warehouse (DW) is a huge collection of data gathered from many sources and an important part of any BI solution to assist management in making better decisions. The Extract, Transform, and Load (ETL) process is the backbone of a DW system, and it is responsible for moving data from source systems into the DW system. The more mature the ETL process, the more reliable the DW system. In this paper, we propose the ETL Maturity Model (EMM) that assists organizations in achieving a high-quality ETL system and thereby enhancing the quality of knowledge produced. The EMM is made up of five levels of maturity, i.e., Chaotic, Acceptable, Stable, Efficient and Reliable. Each level of maturity contains Key Process Areas (KPAs) that have been endorsed by industry experts and include all critical features of a good ETL system. Quality Objectives (QOs) are defined procedures that, when implemented, result in a high-quality ETL process. Each KPA has its own set of QOs, the execution of which meets the requirements of that KPA. Multiple brainstorming sessions with relevant industry experts helped to enhance the model. EMM was deployed in two key projects utilizing multiple case studies to supplement the validation process and support our claim. This model can assist organizations in improving their current ETL process and transforming it into a more mature ETL system. This model can also provide high-quality information to assist users in making better decisions and gaining their trust.
Keywords: ETL maturity model CMMI data warehouse maturity model
Correlation knowledge extraction based on data mining for distribution network planning
2
Authors: Zhifang Zhu, Zihan Lin, Liping Chen, Hong Dong, Yanna Gao, Xinyi Liang, Jiahao Deng 《Global Energy Interconnection》 (EI, CSCD), 2023, No. 4, pp. 485-492 (8 pages)
Traditional distribution network planning relies on the professional knowledge of planners, especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors. The inherent laws reflected by the historical data of the distribution network are ignored, which affects the objectivity of the planning scheme. In this study, to improve the efficiency and accuracy of distribution network planning, the characteristics of distribution network data were extracted using a data-mining technique, and correlation knowledge of existing problems in the network was obtained. A data-mining model based on correlation rules was established. The inputs of the model were the electrical characteristic indices screened using the gray correlation method. The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules. Degree of promotion and chi-square tests were used to verify the rationality of the strong correlation rules of the model output. In this study, the correlation relationship between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined, and the confidence of the correlation rules was obtained. These results can provide an effective basis for the formulation of a distribution network planning scheme.
Keywords: Distribution network planning; data mining; Apriori algorithm; Gray correlation analysis; Chi-square test
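Illustrative sketch (not taken from the paper): the abstract above names the Apriori algorithm, rule confidence, and a "degree of promotion", which the association-rule literature usually calls lift. The Python below is a minimal, generic version of that idea, assuming transactions are plain sets of labels. The function names `apriori` and `rules` and the feeder labels in the sample data are hypothetical, and the gray correlation screening and chi-square check mentioned in the abstract are not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    # Level-1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in sets if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence, lift) for strong rules."""
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                confidence = support / frequent[antecedent]
                lift = confidence / frequent[consequent]
                if confidence >= min_confidence:
                    yield antecedent, consequent, confidence, lift


# Hypothetical feeder records; the labels are made up for illustration.
records = [
    {"high_load_density", "long_feeder", "overload"},
    {"high_load_density", "overload"},
    {"long_feeder"},
    {"high_load_density", "long_feeder", "overload"},
    {"short_feeder"},
]
for lhs, rhs, conf, lift in rules(apriori(records, 0.4), 0.9):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2), round(lift, 2))
```

A lift above 1 suggests the antecedent and consequent occur together more often than they would if independent, which is the kind of check the abstract describes for its strong rules.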
MA-IDS: A Distributed Intrusion Detection System Based on Data Mining
3
Authors: SUN Jian-hua, JIN Hai, CHEN Hao, HAN Zong-fen 《Wuhan University Journal of Natural Sciences》 (CAS), 2005, No. 1, pp. 111-114 (4 pages)
Aiming at the shortcomings in intrusion detection systems (IDSs) used in commercial and research fields, we propose the MA-IDS system, a distributed intrusion detection system based on data mining. In this model, a misuse intrusion detection system (MIDS) and an anomaly intrusion detection system (AIDS) are combined. Data mining is applied to raise detection performance, and a distributed mechanism is employed to increase scalability and efficiency. Host- and network-based mining algorithms employ an improved Bayesian decision theorem suited to real security environments to minimize the risks incurred by false decisions. We describe the overall architecture of the MA-IDS system and discuss specific design and implementation issues.
Keywords: intrusion detection; data mining; distributed system
Designing a Model to Study Data Mining in Distributed Environment
4
Authors: Md. Abadur Rahman, Masud Karim 《Journal of Data Analysis and Information Processing》, 2021, No. 1, pp. 23-29 (7 pages)
To make business policy, market analysis, corporate decisions, fraud detection, etc., we have to analyze and work with huge amounts of data. Generally, such data are taken from different sources. Researchers are using data mining to perform such tasks. Data mining techniques are used to find hidden information in large data sources. Data mining is used in various fields: artificial intelligence, banking, health and medicine, corruption, legal issues, corporate business, marketing, etc. Special interest is given to association rules, data mining algorithms, decision trees and the distributed approach. Data is becoming larger and spreading geographically, so it is difficult to obtain good results from only a central data source. For knowledge discovery, we have to work with distributed databases. Security and privacy considerations are another factor discouraging work with centralized data. For this reason, distributed databases are essential for future processing. In this paper, we propose a framework to study data mining in a distributed environment. The paper presents a framework to bring out actionable knowledge. We show some levels by which actionable knowledge can be generated. Possible tools and techniques for these levels are discussed.
Keywords: Data Mining; Distributed Database; Knowledge Discovery; Classification Algorithm
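Design and Implementation of ETL for the Teradata Data Warehouse in the Telecommunications Industry  Cited by: 2
5
Authors: 张琴和, 李民 《机械设计与制造工程》, 2012, No. A07, pp. 10-13, 17 (5 pages)
This paper introduces the concepts of the Teradata data warehouse and ETL. Based on the requirements and characteristics of data warehouse ETL processes in the telecommunications industry, it studies and designs a model for the Teradata data warehouse ETL tools, designs an ETL framework model that is generally applicable to telecom data warehouses, and implements it. The implementation results show that the model is feasible.
Keywords: data warehouse; Teradata; ETL; ETL tools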
A Distributed Data Mining System Based on Multi-agent Technology  Cited by: 1
6
Authors: 郭黎明, 张艳珍 《Journal of Donghua University(English Edition)》 (EI, CAS), 2006, No. 6, pp. 80-83 (4 pages)
Distributed Data Mining is expected to discover previously unknown, implicit and valuable information from massive data sets inherently distributed over a network. In recent years several approaches to distributed data mining have been developed, but only a few of them make use of intelligent agents. This paper provides the reason for applying Multi-Agent Technology in Distributed Data Mining and presents a Distributed Data Mining System based on Multi-Agent Technology that deals with heterogeneity in such environments. Based on the advantages of both the CS model and the agent-based model, the system is able to address the specific concerns of increasing scalability and enhancing performance.
Keywords: distributed data mining algorithm; multi-agent technology; server; computer technology; information processing
Research and Application of Distributed Data Mining Method for Improving Rural Power Grid Enterprises in Production and Operation Status Evaluation
7
Authors: Gao Xiu-yun, Xiang Wen, Fang Jun-long 《Journal of Northeast Agricultural University(English Edition)》 (CAS), 2019, No. 2, pp. 87-96 (10 pages)
With the reform of the rural network enterprise system, the transfer of property rights in rural power enterprises is accelerating. The evaluation of the operation and development status of rural power enterprises is directly related to their future development and investment direction. At present, the evaluation of the production and operation of rural network enterprises and the development status of the power network relies only on the experience of the evaluation personnel, who set the reference index and form the evaluation results through manual scoring. Because the evaluation results are strongly subjective, their practical guiding significance is weak. Therefore, a distributed data mining method, which has been applied in many fields such as food science, economics and the chemical industry, was proposed for evaluating the status of rural power enterprises. The distributed mathematical model was established by using principal component analysis (PCA) and regression analysis. By screening various technical indicators and determining their relevance, the reference value of the evaluation results was improved. Combined with the statistical program for social sciences (SPSS) data analysis software, the operation status of rural network enterprises was evaluated, and the rationality, effectiveness and economy of the evaluation were verified through comparison with current evaluation results and calculation examples of actual grid operation data.
Keywords: rural power grid; production and management; distributed data mining; statistical program for social sciences (SPSS19)
Refreshing File Aggregate of Distributed Data Warehouse in Sets of Electric Apparatus
8
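Authors: 于宝琴, 王太勇, 张君, 周明, 何改云, 李国琴 《Transactions of Tianjin University》 (EI, CAS), 2006, No. 3, pp. 174-179 (6 pages)
Integrating heterogeneous data sources is a precondition for sharing data within an enterprise. Highly efficient data refreshing can both save system overhead and provide up-to-date data, and quickly updating data in the staging area of the data warehouse is one of the hot issues. An extract-transform-load design based on a new data algorithm called Diff-Match is proposed, which was developed using pattern-matching and data-filtering techniques. It can accelerate data refreshing, filter heterogeneous data, and find the differing sets of data. Its efficiency has been demonstrated by its successful application in an enterprise of electric apparatus groups.
Keywords: distributed; data warehouse; KMP algorithm; electrical equipment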
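Design and Practice of an ETL Strategy Based on Teradata Application Tools  Cited by: 2
9
Authors: 戴邵红, 古春笑, 权毅 《机械工程与自动化》, 2009, No. 1, pp. 162-163, 166 (3 pages)
ETL is an important step in building a data warehouse. This paper introduces the concepts of the data warehouse and ETL and, for the Teradata data warehouse application tools, discusses and designs a concrete strategy for implementing ETL based on ETL Automation, an ETL process scheduling tool.
Keywords: data warehouse; Teradata; ETL; ETL Automation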
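Data Mining for Seizing Business Opportunities First: Data Mining Technology Based on the Data Warehouse  Cited by: 2
10
Authors: 王春梅, 王曙燕 《现代电子技术》, 2006, No. 12, pp. 98-100 (3 pages)
This paper first introduces the data warehouse and the data mining technology built on it, then describes several tools for implementing data mining applications and the principles to follow when selecting them, and finally discusses the open problems of data mining technology and its current commercial importance.
Keywords: data warehouse (DW); data mining; online analytical processing (OLAP); modeling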
Research on Rolling Load Distribution Method based on Data Mining  Cited by: 1
11
Authors: ZHANG Yan-hua, LIU Xiang-hua, WANG Guo-dong 《Journal of Iron and Steel Research(International)》 (SCIE, CAS, CSCD), 2005, No. 6, pp. 30-32, 53 (4 pages)
A new method of establishing a rolling load distribution model was developed using online intelligent information-processing technology for plate rolling. The model combines a knowledge model and a mathematical model, using knowledge discovery in databases (KDD) and data mining (DM) as the starting point. Online maintenance and optimization of the load model are realized. The effectiveness of this new method was verified by offline simulation and online application.
Keywords: rolling load distribution; information processing; knowledge discovery; data mining
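Research on a Batch Intelligent Development Platform Based on a Big Data ETL Engine
12
Authors: 曾国文, 梁华生, 钟玲 《电信工程技术与标准化》, 2024, No. 3, pp. 20-25 (6 pages)
In the big data era, to improve ETL development efficiency when source data types are diverse, this paper proposes a batch intelligent development platform based on a big data ETL engine. It refactors the core ETL code to improve component functionality, implements self-developed class methods in Java, intelligently schedules API interfaces, batch-generates XML scripts for full delete-and-insert loads, incremental synchronization, zipper tables (拉链表, history-tracking tables) and data quality audits, and translates the ETL XML code into executable Java code, lowering the technical barrier for operators using the platform. Practical verification shows that the proposed method improves the platform's development efficiency and allows requirements to be delivered faster.
Keywords: data warehouse; ETL engine; XML script; batch development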
Data Mining Techniques and its Uses in Different Fields: A Review Paper
13
Authors: Gaurav Dhawan 《Journal of Electronic Research and Application》, 2018, No. 4, pp. 1-4 (4 pages)
The paper introduces data mining and issues related to it. Data mining is a technique by which we can extract useful knowledge from a huge set of data. Data mining tasks are used to perform various operations and to solve various problems related to data mining. Data warehouse is the collection of different methods and techniques used to extract useful information from raw data. Genetic algorithm is based on Darwin’s theory, in which low standard chromosomes are removed from the population due to their inability to survive the process of selection. The high standard chromosomes survive and are mixed by recombination to form more appropriate individuals. In this, a huge amount of data is used to predict future results by following several steps.
Keywords: data mining; data warehouse; genetic algorithm; chromosomes
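Research and Implementation of an Incremental ETL Tool  Cited by: 20
14
Authors: 章水鑫, 徐宏炳, 于立 《现代计算机》, 2005, No. 3, pp. 6-10 (5 pages)
Maintaining a data warehouse with the incremental data of its data sources can effectively improve ETL efficiency. Existing general-purpose ETL tools have some problems with incremental extraction: they cannot extract incremental data from multiple heterogeneous data sources, and data can be lost while incremental data is processed. The incremental ETL tool designed in this paper from a practical perspective integrates several incremental data capture methods to handle the differences among heterogeneous data sources in capturing incremental data, and uses auxiliary tables during data processing to solve the data loss problem. Finally, the paper describes the implementation of data transformation and transformation scheduling in the ETL process.
Keywords: incremental ETL tool; data source; data warehouse; ETL efficiency; SEUETL tool; incremental data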
An Adaptive Privacy Preserving Framework for Distributed Association Rule Mining in Healthcare Databases
15
Authors: Hasanien K. Kuba, Mustafa A. Azzawi, Saad M. Darwish, Oday A. Hassen, Ansam A. Abdulhussein 《Computers, Materials & Continua》 (SCIE, EI), 2023, No. 2, pp. 4119-4133 (15 pages)
It is crucial, while using healthcare data, to assess the advantages of data privacy against the possible drawbacks. Data from several sources must be combined for use in many data mining applications. The medical practitioner may use the results of association rule mining performed on this aggregated data to better personalize patient care and implement preventive measures. Historically, numerous heuristics (e.g., greedy search) and metaheuristics-based techniques (e.g., evolutionary algorithm) have been created for the positive association rule in privacy preserving data mining (PPDM). When it comes to connecting seemingly unrelated diseases and drugs, negative association rules may be more informative than their positive counterparts. It is well-known that during negative association rules mining, a large number of uninteresting rules are formed, making this a difficult problem to tackle. In this research, we offer an adaptive method for negative association rule mining in vertically partitioned healthcare datasets that respects users’ privacy. The applied approach dynamically determines the transactions to be interrupted for information hiding, as opposed to predefining them. This study introduces a novel method for addressing the problem of negative association rules in healthcare data mining, one that is based on the Tabu-genetic optimization paradigm. Tabu search is advantageous since it removes a huge number of unnecessary rules and item sets. Experiments using benchmark healthcare datasets prove that the discussed scheme outperforms state-of-the-art solutions in terms of decreasing side effects and data distortions, as measured by the indicator of hiding failure.
Keywords: distributed data mining; evolutionary computation; sanitization process; healthcare informatics
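A Data Cleaning Method for Power Technical Renovation and Overhaul Projects Based on Data Warehouse ETL Technology  Cited by: 1
16
Authors: 沈海天, 嵇惠方, 游睿, 唐梁, 谢晓锋 《电工技术》, 2023, No. 14, pp. 177-179 (3 pages)
Because of the large amount of duplicate and missing data, existing data cleaning methods for power technical renovation and overhaul projects cannot clean dirty data effectively. This paper therefore studies a data cleaning method for such projects based on data warehouse ETL technology. The quality of dirty data from multiple sources is evaluated, and data mining is carried out once the data is judged to meet the expected standard; duplicate records are cleaned using data warehouse ETL technology; and missing values in the project data are handled using Chebyshev's theorem, completing effective cleaning of the project data. Experimental results show that the method achieves the highest cleaning effectiveness, effectively improves data quality, and delivers high-quality data cleaning.
Keywords: data warehouse; ETL technology; data cleaning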
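Design and Implementation of an ETL System for Data Integration  Cited by: 21
17
Authors: 钟华, 冯文澜, 谭红星, 黄涛 《计算机科学》 (CSCD, PKU Core), 2004, No. 9, pp. 87-89, F004 (4 pages)
ETL is a class of tools that extract data from one or more operational databases, clean and transform it, and load it into a data warehouse. This extract, transform and load process applies well to data integration, enabling data exchange and consolidation between different organizations. Based on an analysis of the characteristics of data integration, we propose an ETL process model and have developed DataIntegrator, an ETL system for data integration. This paper discusses the ETL process model, the overall system architecture, and several key technologies. DataIntegrator has been applied in building information systems and provides good support for enterprise application integration.
Keywords: ETL; data integration; process model; data extraction; enterprise application integration; data warehouse; transformation; overall system architecture; operational data; information system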
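Research on ETL Technology in Data Warehouses  Cited by: 116
18
Authors: 张宁, 贾自艳, 史忠植 《计算机工程与应用》 (CSCD, PKU Core), 2002, No. 24, pp. 213-216 (4 pages)
As a key component of the data warehouse, the tool set that supports data extraction, cleaning, transformation and loading is an indispensable success factor for any data warehouse project. This paper briefly introduces ETL technology, including the concepts of ETL, the function and importance of ETL in the data warehouse, and existing research results, and then focuses on the concrete design and implementation of ETL.
Keywords: data warehouse; ETL; database; data model; data extraction; data transformation; data cleaning; data loading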
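Data-Quality-Oriented ETL Process Modeling and Implementation  Cited by: 23
19
Authors: 贾自艳, 黄友平, 罗平, 李嘉佑, 秦亮曦, 史忠植 《系统仿真学报》 (CAS, CSCD), 2004, No. 5, pp. 907-911, 914 (6 pages)
To supply a data warehouse with high-quality data, the data must go through a series of preprocessing steps, extraction-transformation-loading (ETL), before it is loaded into the warehouse. Complexity and usability are the two basic problems that constrain ETL systems. To address them, a unified architecture for the ETL process is presented, covering ETL metadata object modeling, ETL transformation function design, ETL task modeling, and a description language for the ETL task model (XTDL). Based on this architecture and design, an ETL system, MSETL, was developed to supply high-quality data to the multi-strategy data mining platform MSMiner. It offers a friendly interface and unified metadata management for the ETL process, including registration and deletion of ETL transformation functions and generation, execution and deletion of task models.
Keywords: data warehouse; data quality; extraction-transformation-loading (ETL); data mining; data cleaning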
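Discussion and Practice of ETL Technology in Data Warehouses  Cited by: 31
20
Authors: 王克龙, 王玲, 王平立, 宋斌 《计算机应用与软件》 (CSCD, PKU Core), 2005, No. 11, pp. 30-31, 78 (3 pages)
The tool set that supports data extraction, transformation, cleaning and loading is an indispensable key component of any data warehouse project. This paper focuses on ETL technology and the principles for selecting ETL tools, and uses a concrete example to describe in detail the design and implementation of the ETL process.
Keywords: data warehouse; ETL technology; data extraction; data mining; online analytical processing; reliability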