Multidimensional Data Duplicate Detection Method Based on an Improved R-tree
Abstract To address duplicate detection and deduplication for high-dimensional data in the big-data era, this paper exploits the properties of clustering and constructs the R-tree with NSKSA, a method that produces more compact clusters. This yields a better spatial index structure and reduces the number of spatial-node accesses. An improved ADDR algorithm is then used to raise the efficiency of duplicate detection on multidimensional data. Experiments show that NSKSA builds a more compact R-tree than the DKSC and TGS algorithms, and that the duplicate detection rate of the improved ADDR algorithm is nearly 5% higher than that of DDR. The results indicate that the proposed NSKSA and ADDR algorithms effectively improve the duplicate detection rate for multidimensional data.
Author HE Jianying (贺建英), School of Intelligent Manufacturing, Sichuan University of Arts and Science, Dazhou 635000, China
Source Electronic Design Engineering (《电子设计工程》), 2023, No. 3, pp. 74-80 (7 pages)
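Funding Key Projects of the Sichuan Revolutionary Old Area Development Research Center (SLQ2020SA-01, SLQ2021BA-01); Teaching Reform Projects of Sichuan University of Arts and Science (2020JZ016, 2020JZ001).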
Keywords clustering; R-tree; duplicate detection; high-dimensional data
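
The abstract does not spell out how NSKSA or ADDR work, so the following is only a minimal sketch of the general idea it describes: group records into compact bounding boxes, then skip any box that cannot contain a duplicate. It assumes a flat, single-level index rather than a full R-tree, uses a plain lexicographic sort as a stand-in for clustering, and treats two records as duplicates when every dimension differs by at most `eps`. All names here (`build_leaves`, `find_duplicates`, `eps`, `capacity`) are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]  # (lows, highs) of a minimum bounding rectangle


def mbr(points: Sequence[Point]) -> Box:
    """Return the minimum bounding rectangle of a non-empty set of points."""
    dims = range(len(points[0]))
    lows = tuple(min(p[d] for p in points) for d in dims)
    highs = tuple(max(p[d] for p in points) for d in dims)
    return lows, highs


def build_leaves(points: Sequence[Point], capacity: int = 4) -> List[Tuple[Box, List[int]]]:
    """Pack record indices into leaves of at most `capacity` entries.

    Assumption: a lexicographic sort stands in for the clustering step;
    it is NOT the NSKSA method named in the abstract.
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    leaves = []
    for start in range(0, len(order), capacity):
        ids = order[start:start + capacity]
        leaves.append((mbr([points[i] for i in ids]), ids))
    return leaves


def intersects(box: Box, lows: Point, highs: Point) -> bool:
    """True if the leaf box overlaps the query box in every dimension."""
    blo, bhi = box
    return all(blo[d] <= highs[d] and bhi[d] >= lows[d] for d in range(len(lows)))


def find_duplicates(points: Sequence[Point], eps: float = 0.0, capacity: int = 4):
    """Return (sorted duplicate index pairs, number of leaf visits).

    Assumption: this is a generic box-pruned search, not the ADDR algorithm.
    """
    leaves = build_leaves(points, capacity)
    pairs = []
    visited = 0
    for i, p in enumerate(points):
        lows = tuple(x - eps for x in p)
        highs = tuple(x + eps for x in p)
        for box, ids in leaves:
            if not intersects(box, lows, highs):
                continue  # pruned: no record in this leaf can match
            visited += 1
            for j in ids:
                # j > i reports each pair once and skips the record itself
                if j > i and all(abs(a - b) <= eps for a, b in zip(p, points[j])):
                    pairs.append((i, j))
    return sorted(pairs), visited


if __name__ == "__main__":
    records = [
        (1.0, 2.0, 3.0),
        (5.0, 5.0, 5.0),
        (1.0, 2.0, 3.0),
        (9.0, 1.0, 0.0),
        (5.0, 5.0, 5.05),
    ]
    print(find_duplicates(records, eps=0.1, capacity=2))
```

In this sketch the `visited` counter plays the role of the node-access count mentioned in the abstract: the tighter the leaf boxes, the more leaves are pruned per query. A real R-tree would add internal nodes so that whole subtrees, not just leaves, can be skipped.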