
A Method for Cleansing Heterogeneous Abnormal Data from Multiple Sources in a Database (cited by: 1)
Abstract Conventional cleansing of multi-source heterogeneous abnormal data in databases relies mainly on computing similarity values between data features. This ignores the effect of the temporal correlation of data features on abnormal-data identification and leads to a low recall rate in the cleansing results. This paper therefore proposes a cleansing method for multi-source heterogeneous abnormal data in databases based on temporal correlation and a density clustering algorithm. The multi-source heterogeneous data are first preprocessed by denoising and normalization, and the temporal correlation degree of the processed data features is calculated. After spatial feedback, the clustering density of the data is calculated with the density clustering algorithm to identify abnormal data; the missing parts of the abnormal data are then solved for and filled in, which completes the cleansing. Experimental results show that the cleansing results obtained with the proposed method achieve a high recall rate, with a mean of up to 0.94, and high reliability, meeting the practical needs of cleansing multi-source heterogeneous abnormal data in databases.
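The steps named in the abstract (normalization, a temporal feature, density-based identification, filling, and recall) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the deviation-from-local-median feature stands in for the temporal correlation degree, a plain neighbour count within radius `eps` stands in for the density clustering step, and linear interpolation stands in for the filling step. All function names, the sample series, and the values `window=2`, `eps=0.05`, and `min_pts=2` are assumptions chosen for illustration.

```python
from math import hypot
from statistics import median


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def temporal_features(values, window=2):
    """Pair each value with its deviation from the median of nearby samples.

    Assumed stand-in for the paper's temporal correlation degree.
    Requires at least two samples.
    """
    feats = []
    for i, v in enumerate(values):
        nbrs = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        feats.append((v, v - median(nbrs)))
    return feats


def flag_low_density(points, eps, min_pts):
    """Flag points with fewer than min_pts other points within distance eps."""
    flags = []
    for i, (x1, y1) in enumerate(points):
        count = sum(
            1
            for j, (x2, y2) in enumerate(points)
            if j != i and hypot(x1 - x2, y1 - y2) <= eps
        )
        flags.append(count < min_pts)
    return flags


def fill_flagged(values, flags):
    """Replace flagged samples by interpolating between unflagged neighbours."""
    good = [i for i, f in enumerate(flags) if not f]
    cleaned = list(values)
    if not good:
        return cleaned  # nothing trustworthy to interpolate from
    for i, flagged in enumerate(flags):
        if not flagged:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            cleaned[i] = values[right]
        elif right is None:
            cleaned[i] = values[left]
        else:
            w = (i - left) / (right - left)
            cleaned[i] = values[left] + w * (values[right] - values[left])
    return cleaned


def recall(true_idx, flagged_idx):
    """Fraction of truly abnormal samples that were flagged."""
    true_set = set(true_idx)
    if not true_set:
        return 0.0
    return len(true_set & set(flagged_idx)) / len(true_set)


# Hypothetical series with one injected spike at index 3.
series = [10.0, 10.2, 10.1, 50.0, 10.3, 10.2, 10.4, 10.1]
points = temporal_features(min_max_normalize(series))
flags = flag_low_density(points, eps=0.05, min_pts=2)
flagged = [i for i, f in enumerate(flags) if f]
cleaned = fill_flagged(series, flags)
print(flagged, recall([3], flagged))
```

On this toy series only the spike is flagged, so recall against the single known abnormal sample is 1.0. The neighbour count is quadratic in the number of samples, which is acceptable for a sketch but would need a spatial index on real database tables.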
Authors WANG Caixia; TAO Jian (Anhui Business College of Vocational Technology, Wuhu 241002, China)
Source Journal of Tonghua Normal University, 2023, No. 12, pp. 54-60 (7 pages)
Funding Anhui Provincial Scientific Research Planning Project (2022AH052741).
Keywords multi-source heterogeneous data; abnormal data cleansing; database; data cleansing; temporal correlation; density clustering algorithm