
An incremental discovering algorithm for conditional functional dependencies (cited by: 1)
Abstract: Frequent database updates can change the set of conditional functional dependencies (CFDs) that hold. To obtain accurate CFDs, the discovery process can be rerun on the updated database, but this wastes much of its time reprocessing the original dataset. To address this problem, the paper proposes CFUP, an incremental CFD discovery algorithm based on the CFINDER algorithm. When a batch of new data is added to the database, CFUP starts from the existing CFDs, removes those that no longer hold, and discovers the new CFDs that hold. Experiments show that CFUP performs incremental CFD updates effectively: compared with rerunning CFINDER, it reduces the number of scans of the original dataset and improves the efficiency of updating CFDs.
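The pruning step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CFUP algorithm itself, whose details are not given on this page. It assumes constant CFDs only (every left-hand-side attribute is bound to a constant), for which a dependency that held on the original data still holds after an insertion exactly when no newly added row violates it. The names `ConstantCFD`, `violates`, and `prune_invalid`, and the sample data, are hypothetical. Discovering new CFDs is not shown.

```python
from typing import Dict, List, NamedTuple, Tuple

Row = Dict[str, str]


class ConstantCFD(NamedTuple):
    """Hypothetical constant CFD: if every (attribute, value) pair in lhs
    matches a row, then that row must have rhs_attr == rhs_value."""
    lhs: Tuple[Tuple[str, str], ...]
    rhs_attr: str
    rhs_value: str


def violates(row: Row, cfd: ConstantCFD) -> bool:
    """True if the row matches the left-hand-side pattern but not the right-hand side."""
    matches_lhs = all(row.get(attr) == value for attr, value in cfd.lhs)
    return matches_lhs and row.get(cfd.rhs_attr) != cfd.rhs_value


def prune_invalid(cfds: List[ConstantCFD], delta: List[Row]) -> List[ConstantCFD]:
    """Keep the CFDs that no newly added row violates.

    Only the new batch `delta` is scanned. This is sufficient for constant
    CFDs that already held on the original data (an assumption of this sketch).
    """
    return [cfd for cfd in cfds if not any(violates(row, cfd) for row in delta)]


# Illustrative data: two CFDs assumed to hold on the original dataset.
existing = [
    ConstantCFD((("country", "UK"),), "code", "44"),
    ConstantCFD((("country", "US"),), "code", "1"),
]
# New batch: the second row contradicts the second CFD.
delta = [
    {"country": "UK", "code": "44"},
    {"country": "US", "code": "01"},
]
kept = prune_invalid(existing, delta)
```

In this sketch `kept` retains only the first dependency. Variable CFDs, which compare pairs of rows, would also require checking new rows against the original data and are outside its scope.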
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 8, pp. 149-155 (7 pages)
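Funding: Hunan Provincial Science Fund for Distinguished Young Scholars (11JJ1011); National Natural Science Foundation of China (61272063); Program for New Century Excellent Talents, Ministry of Education (NCET-10-0140); project funded by the Education Department of Hunan Province (09K085); general project of the Education Department of Hunan Province (09C401)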
Keywords: conditional functional dependencies; incremental algorithm; database



