Abstract
Commonly used methods for dynamically cleaning poor-quality data are large in scale and must be pruned before they can be applied, which makes them inefficient and their cleaning results inaccurate. A new method for dynamically cleaning poor-quality data in embedded real-time systems is therefore proposed. Poor-quality data mainly comprise erroneous data, duplicate data, and incomplete data. Erroneous data are cleaned using statistical expectation: a baseline range in the form of a confidence interval is computed, and erroneous data are cleaned against that range. Edit distance is used to obtain the similarity between two strings, and duplicate data are dynamically cleaned according to that similarity. The incompleteness of every record in the embedded real-time system database is evaluated, and the evaluation result determines whether the corresponding data are removed. Experimental results show that the proposed method cleans poor-quality data with high accuracy.
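The three cleaning steps named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption: the baseline range is taken as the mean plus or minus `k` standard deviations, the similarity is the edit distance normalized by the longer string's length, and incompleteness is the fraction of empty fields. The function names and the parameter `k` are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pstdev


def baseline_range(values, k=2.0):
    """Return (low, high) as mean -/+ k standard deviations.

    Assumption: the paper's exact interval formula is not given in the
    abstract; k is a hypothetical width parameter.
    """
    mu = fmean(values)
    sigma = pstdev(values)
    return mu - k * sigma, mu + k * sigma


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; the normalization is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def incompleteness(record):
    """Fraction of fields that are None or empty (assumed measure)."""
    if not record:
        return 1.0
    missing = sum(1 for v in record.values() if v is None or v == "")
    return missing / len(record)
```

Under these assumptions, a value outside `baseline_range(...)` would be treated as erroneous, two records whose `similarity` exceeds a chosen threshold would be treated as duplicates, and a record whose `incompleteness` exceeds a chosen threshold would be removed. The thresholds themselves are not stated in the abstract.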
Source
《科学技术与工程》
Peking University Core Journal (北大核心)
2017, Issue 28, pp. 234-239 (6 pages)
Science Technology and Engineering
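Funding
Henan Province Science and Technology Research Project, 2016 (162102210361)
Key Scientific Research Project of Higher Education Institutions of Henan Province (16B510003)
Huanghuai University Young Teachers' Research Capability Improvement Project (201512711)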
Keywords
embedded real-time system
poor-quality data
dynamic cleaning