Abstract
Association rule mining is the data mining method used to analyze the intrinsic relationships between things. Spatiotemporal big data is described along three dimensions: time, space, and theme. Combining the three yields data that is rapidly updated, multi-source, and massive. Because of the associations inherent in it, spatiotemporal data, with its dynamic and multidimensional nature, has become a new research direction in data mining. It contains associations among objects, processes, and events in space, time, and semantics. Current work on association rule mining for spatiotemporal data focuses mainly on its multidimensionality and pays little attention to its dynamic growth. This paper first introduces the relevant theory of spatiotemporal data, association rule algorithms in data mining, and personalized recommendation algorithms. It then systematically analyzes the commonly used association rule algorithms FP-Growth, the ordered tree (CAN-tree), and Apriori on the basis of a detailed study. To address the incremental problem in spatiotemporal data mining, a spatiotemporal association rule algorithm based on a heap-ordered tree is proposed. The algorithm is designed in two steps. The first step initializes the spatiotemporal data: the spatial and temporal information is extracted, the spatial data is partitioned by region, a time-decay value is computed from the temporal information, and the items of each transaction to be processed are extended accordingly. The second step uses the ordered-tree (CAN-tree) data structure to mine association rules from the transactions initialized in the first step. Finally, experiments are designed whose comparative analysis covers two aspects, the level of spatial region partitioning and the granularity of the increment size, and the effectiveness of the algorithm is verified.
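The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the partitioning scheme, the decay formula, or the tree details, so the square grid cells, the exponential decay `exp(-decay_rate * age)`, the `item@cell` labelling, and all names below (`Record`, `extend_transaction`, `CanonicalTree`) are assumptions made for illustration. The tree is a plain prefix tree with items inserted in a fixed (sorted) order, in the spirit of a CAN-tree, so that new transactions can be added incrementally without rebuilding.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical spatiotemporal transaction: items plus a location and a time."""
    items: list
    x: float
    y: float
    t: float


def extend_transaction(rec, cell_size, now, decay_rate):
    """Step 1 (assumed form): label each item with its grid cell and compute a decay weight."""
    cell = (int(rec.x // cell_size), int(rec.y // cell_size))
    # Assumption: older records count less, with exponential decay in their age.
    weight = math.exp(-decay_rate * (now - rec.t))
    # Sorting gives the fixed item order the tree relies on.
    items = sorted(f"{item}@{cell[0]},{cell[1]}" for item in set(rec.items))
    return items, weight


@dataclass
class Node:
    count: float = 0.0
    children: dict = field(default_factory=dict)


class CanonicalTree:
    """Prefix tree over items in sorted order; supports incremental inserts."""

    def __init__(self):
        self.root = Node()

    def insert(self, items, weight=1.0):
        """Add one extended transaction (items must already be sorted)."""
        node = self.root
        for item in items:
            node = node.children.setdefault(item, Node())
            node.count += weight

    def support(self, itemset):
        """Weighted number of inserted transactions that contain every item in itemset."""
        target = sorted(itemset)

        def walk(node, idx):
            total = 0.0
            for item, child in node.children.items():
                if item == target[idx]:
                    if idx == len(target) - 1:
                        total += child.count
                    else:
                        total += walk(child, idx + 1)
                elif item < target[idx]:
                    # The wanted item may still appear deeper on this path.
                    total += walk(child, idx)
                # item > target[idx]: the wanted item cannot appear below.
            return total

        return walk(self.root, 0) if target else 0.0


# Step 2 (assumed form): insert the extended transactions, then query supports.
tree = CanonicalTree()
batch = [
    Record(["a", "b"], 1.0, 1.0, 10.0),
    Record(["a", "b", "c"], 2.0, 3.0, 10.0),
    Record(["a"], 15.0, 1.0, 10.0),
]
for rec in batch:
    tree.insert(*extend_transaction(rec, cell_size=10.0, now=10.0, decay_rate=0.1))

# An incremental update is just another insert; the existing tree is untouched.
tree.insert(*extend_transaction(Record(["b", "c"], 4.0, 4.0, 10.0), 10.0, 10.0, 0.1))
```

Because the item order is fixed rather than frequency-based, an increment never forces reordering of existing paths, which is the property that makes this kind of tree attractive for growing data. Rule generation (confidence, degree of interest) is left out of the sketch.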
Authors
杨井荣
柳军
YANG Jing-rong; LIU Jun (Engineering and Technology College of Chengdu University of Technology, Leshan 614007, China)
Source
《计算机技术与发展》
2021, No. 6, pp. 19-23 (5 pages)
Computer Technology and Development
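Funding
Sichuan Provincial Education Research Project (18ZA0071)
Youth Science Foundation of the Engineering and Technology College of Chengdu University of Technology (c122017018).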
Keywords
spatiotemporal data
association rule algorithm
heap-ordered tree
spatiotemporal data mining
degree of interest