Survey of Chinese data cleaning (中文数据清洗研究综述)

Cited by: 42
Abstract: This paper surveys research on Chinese data cleaning. It clarifies the relationship between total data quality management and data cleaning, and gives the definition and objects of data cleaning. It introduces the background of the Chinese data cleaning problem, the state of research in China and abroad, and current research hotspots, and briefly presents the basic principles, models, and existing algorithms of data cleaning. Taking national conditions and project demands into account, it focuses on methods for Chinese data cleaning. Finally, it summarizes the shortcomings of current research on Chinese data cleaning and discusses future research topics and applications.
Source: 《计算机工程与应用》 (Computer Engineering and Applications), CSCD, 2012, No. 14, pp. 121-129 (9 pages)
Funding: National 863 Program Key Project (No. 2007AA010305)
Keywords: Chinese data cleaning; data quality management; data integration