
An Important Aspect of Big Data: Data Usability

Cited by: 260
Abstract: With the rapid development of information technology, especially the rapid progress of the Internet, cyber-physical systems, the Internet of Things, cloud computing, and social networks, big data has become ubiquitous. It is becoming an important asset of the information society, but it also brings major challenges, and improving data usability is among the most significant. The explosive growth in data volume is accompanied by dirty data, which degrades data quality and usability and poses a serious threat to the information society. Both academia and industry have taken notice, and recent research on data usability has produced some results, but little of this work addresses the usability of big data specifically. This paper introduces the basic concepts of big data usability, discusses its challenges and research problems, and surveys existing work on data usability.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 6, pp. 1147-1162 (16 pages)
Funding: National Key Basic Research Program of China ("973" Program) (2012CB316200); Key Program of the National Natural Science Foundation of China (61033015)
Keywords: big data; data usability; data consistency; data completeness; data accuracy; data currency; entity identity
