Abstract
A spatial co-location pattern is a set of spatial features whose instances are frequently located near one another in space. Traditional co-location pattern mining uses a single distance threshold to define the spatial neighbor relationship, which ignores the effect of distance variation on that relationship; in addition, the minimum prevalence threshold is difficult to set for users without domain knowledge of the data. To address these problems, this paper presents a method for calculating the neighborhood membership degree based on fuzzy theory and d-grids. The method avoids computing Euclidean distances and uses the d-grid to quickly find the maximal cliques that satisfy the fuzzy neighbor relationship. It is then combined with the Top-k idea to mine the k most prevalent spatial co-location patterns. Experimental results show that the method is more efficient and gives finer-grained results. A comparison of recall rates shows that most of the k most prevalent patterns obtained by this method are the same as those obtained by the traditional co-location pattern mining algorithm, which indicates that the proposed fuzzy measure and mining algorithm have considerable practical value.
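As a rough illustration of the two ideas named in the abstract, the sketch below assigns a fuzzy neighbor degree from grid-cell offsets alone (no Euclidean distance is computed) and then keeps the k highest-prevalence patterns. It is not the paper's algorithm: the membership values (1.0 for the same cell, 0.5 for an adjacent cell, 0 otherwise), the function names `cell_of`, `grid_membership`, and `top_k`, and the sample prevalence scores are all assumptions chosen for illustration, and maximal-clique search is omitted.

```python
import heapq
from math import floor


def cell_of(point, d):
    """Map an (x, y) point to the integer index of its d-by-d grid cell."""
    x, y = point
    return (floor(x / d), floor(y / d))


def grid_membership(p, q, d):
    """Hypothetical fuzzy neighbor degree derived from cell offsets only.

    Assumed values: same cell -> 1.0, adjacent cell -> 0.5, otherwise 0.0.
    """
    (ax, ay), (bx, by) = cell_of(p, d), cell_of(q, d)
    offset = max(abs(ax - bx), abs(ay - by))  # Chebyshev offset, in cells
    return {0: 1.0, 1: 0.5}.get(offset, 0.0)


def top_k(prevalence, k):
    """Return the k (pattern, score) pairs with the highest score, best first."""
    return heapq.nlargest(k, prevalence.items(), key=lambda kv: kv[1])


# Made-up prevalence scores for three candidate patterns.
scores = {"AB": 0.6, "AC": 0.9, "BC": 0.3}
best_two = top_k(scores, 2)
```

Selecting by rank with `top_k` means the user supplies only k, which mirrors the abstract's motivation of not requiring a hand-tuned minimum prevalence threshold.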
Authors
李钧毅
王丽珍
陈红梅
LI Junyi; WANG Lizhen; CHEN Hongmei (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)
Source
《清华大学学报(自然科学版)》
CSCD
PKU Core Journal
2021, No. 9, pp. 943-952 (10 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National Natural Science Foundation of China (61966036, 61662086)
Yunnan Province Innovation Team Project (2018HC019).