Abstract
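Space-time clustering is one of the main topics of data mining research and has important applications in environmental protection, disease prevention and control, and crime prevention and control. Existing space-time clustering methods all treat the temporal "distance" as the actual elapsed interval. Crime cases, however, have social attributes and show clear periodic characteristics at different time scales, and ignoring these characteristics makes it difficult to reflect their real spatio-temporal patterns. This paper jointly considers the time attributes at multiple time scales, constructs an equivalent spatio-temporal neighborhood, and, drawing on the classical density clustering algorithm, proposes a multiple time-scale equivalent spatio-temporal neighborhood density clustering algorithm (MTS-ESTN DBSCAN). A cluster analysis of crime case data from the urban area of Fuzhou in 2013 shows that the method is feasible for the space-time clustering of crime cases, and that it has theoretical significance and practical value for further in-depth research on urban criminal geography.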
Space-time clustering, one of the main research topics in data mining, has important applications in environmental protection, disease prevention and control, and crime prevention and control. Existing space-time clustering methods treat the temporal "distance" as the actual elapsed interval. However, crime cases, which have social attributes, show clear periodic characteristics at different time scales, and ignoring these characteristics makes it difficult to reveal their real spatio-temporal patterns. Therefore, based on DBSCAN, an algorithm considering multiple time scales and an equivalent spatio-temporal neighborhood (MTS-ESTN DBSCAN) is proposed. The algorithm considers the time attributes at multiple time scales, builds the equivalent spatio-temporal neighborhood, and borrows the concept of the classical density clustering algorithm. Within the equivalent spatio-temporal neighborhood, the Euclidean distance (L2-norm) measures spatial proximity in the space domain, and similarity in the time domain is defined with an improved HDsim function, a method for measuring the unified similarity of high-dimensional data. Cluster analysis was conducted on crime case data from the urban area of Fuzhou in 2013, and the clustering quality was evaluated with several indices: CH (Calinski-Harabasz), Sil (Silhouette), DB (Davies-Bouldin) and KL (Krzanowski-Lai). The results show that the method is feasible for the space-time cluster analysis of crime cases. Compared with the traditional ST-DBSCAN algorithm, it produces better clustering quality. In addition, it can find accumulation characteristics that follow the patterns of people's work, rest and other social activities over a long period. It has theoretical significance and practical value for the further study of urban criminal geography.
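The idea of an equivalent spatio-temporal neighborhood can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the improved HDsim function, so a plain circular gap on hour of day and day of week is used here as a stand-in for the time-domain similarity. The tuple layout `(x, y, hour_of_day, day_of_week)`, the function names, and the thresholds `eps_space`, `eps_hour` and `eps_weekday` are all hypothetical. Only the Euclidean spatial distance and the DBSCAN-style expansion follow the description above.

```python
import math

def cyclic_gap(a, b, period):
    """Shortest gap between two positions on a cycle of length `period`."""
    d = abs(a - b) % period
    return min(d, period - d)

def is_neighbor(p, q, eps_space, eps_hour, eps_weekday):
    """Neighborhood test for events given as (x, y, hour_of_day, day_of_week).

    Space uses the Euclidean distance; time uses circular gaps at two
    time scales (an assumed stand-in for the improved HDsim similarity).
    """
    spatial = math.hypot(p[0] - q[0], p[1] - q[1])
    return (spatial <= eps_space
            and cyclic_gap(p[2], q[2], 24) <= eps_hour
            and cyclic_gap(p[3], q[3], 7) <= eps_weekday)

def region_query(points, i, **eps):
    """Indices of all points in the neighborhood of point i (including i)."""
    return [j for j, q in enumerate(points) if is_neighbor(points[i], q, **eps)]

def dbscan(points, min_pts, **eps):
    """Classical DBSCAN expansion; noise is labelled -1."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, **eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, **eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels

# Made-up events: four late-night weekend events at one place, one unrelated event.
events = [
    (0.0, 0.0, 23.0, 5),
    (0.1, 0.0, 0.5, 5),
    (0.0, 0.1, 23.5, 4),
    (0.1, 0.1, 1.0, 6),
    (5.0, 5.0, 12.0, 2),
]
labels = dbscan(events, min_pts=3, eps_space=0.5, eps_hour=2, eps_weekday=1)
print(labels)  # [0, 0, 0, 0, -1]
```

In this toy data the events at 23:00 and 01:00 fall in the same neighborhood because the hour gap is measured around the 24-hour cycle. Measured as a raw interval within one day, the same pair would be 22 hours apart.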
Source
Journal of Geo-information Science (《地球信息科学学报》)
CSCD
Peking University Core Journals (北大核心)
2015, Issue 7, pp. 837-845 (9 pages)
Funding
National High Technology Research and Development Program of China ("863" Program), Major Project (2012AA12A208)
Keywords
space-time clustering
multiple time scales
density-based clustering
crime cases