Abstract
Data mining requires clean, well-organized data, and data quality directly affects mining results. A data warehouse extracts data from many kinds of sources and cleans, integrates, selects, and transforms it, which ensures the high-quality data that data mining needs. This paper proposes using the data warehouse as the data source and a scheduled job to pre-generate a simplified set of frequent 2-itemsets. Exploiting the high execution efficiency of stored procedures, the method compresses the database and, at the same time, the frequent i-itemsets, yielding an efficient improved Apriori algorithm.
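The general idea summarized in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation, which runs as stored procedures and a scheduled job against a data warehouse. The function names `frequent_pairs` and `apriori_from_pairs`, the use of an absolute support count `min_count`, and the specific trimming rules are assumptions made for illustration. The sketch seeds level-wise mining with precomputed frequent 2-itemsets and shrinks the transaction set on every pass.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_count):
    """Count every item pair in one pass and keep those meeting min_count.

    Stands in for the precomputed frequent 2-itemsets (hypothetical helper).
    """
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[frozenset(pair)] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


def apriori_from_pairs(transactions, min_count):
    """Level-wise mining seeded with frequent 2-itemsets (illustrative only)."""
    db = [frozenset(t) for t in transactions]
    level = frequent_pairs(db, min_count)
    result = dict(level)
    k = 2
    while level:
        # Shrink the data: drop items that appear in no frequent k-itemset,
        # then drop transactions too short to hold a (k+1)-itemset.
        live = frozenset().union(*level)
        db = [t & live for t in db]
        db = [t for t in db if len(t) > k]

        # Join step: merge frequent k-itemsets that differ in exactly one item.
        prev = set(level)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k))
        }

        # Count candidate support over the reduced transaction set.
        counts = Counter()
        for t in db:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_count}
        result.update(level)
        k += 1
    return result
```

Both reductions are safe under the Apriori property: an item outside every frequent k-itemset cannot belong to a frequent (k+1)-itemset, and a transaction with at most k remaining items cannot contain one.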
Source
《河北省科学院学报》
CAS
2008, No. 2, pp. 10-14 (5 pages)
Journal of The Hebei Academy of Sciences
Funding
Supported by the Tianjin Science and Technology Development Program (04310941R)