Abstract
Data cleaning is the starting point for data analysis, data mining, and related research. This paper reviews research on data cleaning. It first explains the relationship between data cleaning and data quality, then gives an overview of data cleaning and analyzes its steps and methods. Finally, it briefly surveys recent research on data cleaning in China and abroad and discusses prospects for research on cleaning Chinese-language data.
Author
廖书妍
LIAO Shu-yan (Central China Normal University, Wuhan 430079, China)
Source
《电脑知识与技术》
2020, Issue 20, pp. 44-47 (4 pages)
Computer Knowledge and Technology
Funding
Supported by the Central China Normal University Undergraduate Innovation and Entrepreneurship Training Program (Project No. 20190410005).
Keywords
dirty data
data cleaning
data quality
similar duplicate data
cleaning steps