
Conditional Functional Dependencies and Application in Domain-independent Data Cleaning
Cited by: 1
Abstract: A conditional functional dependency (CFD) extends a functional dependency (FD) with semantic constraints, and it is better suited than an FD to database consistency detection and data cleaning. This article discusses the concepts and basic properties of CFDs, discusses how to apply them to data cleaning, proposes improvements to a previously proposed CFD-based data cleaning scheme, and demonstrates the feasibility of those improvements through experiments.
Institution: Jinan University
Source: Microcomputer Applications (《微型电脑应用》), 2012, No. 9, pp. 23-26, 30 (5 pages)
Keywords: data mining; data cleaning; conditional functional dependency
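
To make the abstract's central idea concrete, here is a minimal sketch of how a CFD differs from a plain FD when checking consistency. It is not the article's algorithm: the function name `cfd_violations`, the `_` wildcard convention, and the sample customer rows are all assumptions made for illustration. A CFD is modeled as an FD `lhs -> rhs` plus a pattern that restricts which tuples the dependency applies to.

```python
from collections import defaultdict

WILDCARD = "_"  # assumed convention: "_" in a pattern matches any value


def matches(value, pattern):
    """A value matches a pattern cell if the cell is a wildcard or equal."""
    return pattern == WILDCARD or value == pattern


def cfd_violations(rows, lhs, rhs, pattern):
    """Return sorted indices of rows violating the CFD (lhs -> rhs, pattern).

    Hypothetical helper: rows are dicts, lhs is a list of attribute names,
    rhs is one attribute name, pattern maps each attribute to a constant
    or to the wildcard.
    """
    violating = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # The dependency only applies to rows matching the LHS pattern.
        if not all(matches(row[a], pattern[a]) for a in lhs):
            continue
        # Single-tuple violation: the RHS constant is not respected.
        if not matches(row[rhs], pattern[rhs]):
            violating.add(i)
        groups[tuple(row[a] for a in lhs)].append(i)
    # Pair violation: rows agree on the LHS but disagree on the RHS.
    for members in groups.values():
        if len({rows[i][rhs] for i in members}) > 1:
            violating.update(members)
    return sorted(violating)


# Made-up sample data: country code, postal code, street.
rows = [
    {"CC": "44", "ZIP": "EH4 1DT", "STR": "Mayfield"},
    {"CC": "44", "ZIP": "EH4 1DT", "STR": "Crichton"},
    {"CC": "01", "ZIP": "07974", "STR": "Tree Ave"},
    {"CC": "01", "ZIP": "07974", "STR": "Elm Str"},
]

# CFD: when CC = 44, ZIP determines STR.
uk_only = {"CC": "44", "ZIP": WILDCARD, "STR": WILDCARD}
print(cfd_violations(rows, ["CC", "ZIP"], "STR", uk_only))  # [0, 1]

# The plain FD [CC, ZIP] -> STR is the same check with an all-wildcard pattern.
plain_fd = {"CC": WILDCARD, "ZIP": WILDCARD, "STR": WILDCARD}
print(cfd_violations(rows, ["CC", "ZIP"], "STR", plain_fd))  # [0, 1, 2, 3]
```

In this sketch the conditional version flags only the rows with CC = 44, while the unconditional FD would also flag the CC = 01 rows, where a postal code may legitimately cover several streets. That is the kind of semantic constraint the abstract refers to.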


