Abstract
In the Internet age, data has become a new factor of production, a basic and strategic resource, and an important productive force. Big data services are being widely developed across China, and data exchanges have been established one after another. Against this background, data quality has gradually become a key issue restricting the development of the data industry. This paper first divides data quality research into three stages in chronological order and comprehensively summarizes the representative results of each stage, including theories, methods, techniques, tools and frameworks. It then analyzes the challenges and opportunities that data quality research faces in the environment of the Internet of Things, cloud computing and big data. Finally, it discusses prospective research focuses and development trends of data quality from six aspects: data quality models, quality management of big data, quality-related techniques for big data, crowdsourcing, the Internet of Things, and open data.
Authors
蔡莉
梁宇
朱扬勇
何婧
CAI Li; LIANG Yu; ZHU Yang-yong; HE Jing (School of Computer Science, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Data Science, Shanghai 201203, China; School of Software, Yunnan University, Kunming 650091, China)
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2018, Issue 4, pp. 1-10 (10 pages)
Computer Science
Funding
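Supported by the National Natural Science Foundation of China: Research on Mining Urban Hotspot Areas and Resident Travel Patterns Based on Location Big Data (61663047)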