The differentiation of urban residential space is a key topic in urban research, with important theoretical significance for urban development and residential choice. In this paper, web crawler technology is used to collect urban big data. Using spatial analysis and clustering, the differentiation pattern of residential space in the main urban area of Wuhan is revealed. Residential communities are divided into five types: "Garden", "Guozi", "Wangjiangshan", "Yashe", and "Shuxin" communities. The "Garden" community is aimed at the elderly, with good medical accessibility and open space around the community. The "Guozi" community is aimed at young people and has good access to educational and commercial facilities. The "Wangjiangshan" community is oriented towards the social elite, with a beautiful natural living environment, proximity to the city core, and convenient transportation. The "Yashe" community is aimed at the general income group, and its location is characterized by being adjacent to commercial districts and by convenient transportation. The "Shuxin" community is aimed at the middle and lower income groups; it is far from the city center, and the quality of its living environment is not high.
This paper introduces some definitions and defines a set of calculating indexes to facilitate the research, and then presents an algorithm for comparing spatial clustering results across different clustering themes. The research shows that, with traditional spatial clustering as the first step, valuable spatial correlation patterns can be found by comparing the clustering results of multiple themes. Those patterns indicate how the themes relate to one another and thus support a deeper understanding of the studied spatial entities. An example is also given to demonstrate the principle and process of the method.
Location selection is a key issue in the construction of electric vehicle charging stations. It involves two problems: determining the location of a charging station, and evaluating that location. To determine the charging station location, a spatial clustering algorithm is proposed and programmed. An example simulation shows the effectiveness of the spatial clustering algorithm. To evaluate the charging station location, a multi-hierarchical fuzzy method is proposed. Based on the location factors of electric vehicle charging stations, a hierarchical evaluation structure is constructed, comprising three levels, 4 first-class factors, and 14 second-class factors. The fuzzy multi-hierarchical evaluation model and algorithm are built. The analysis results show that the multi-hierarchical fuzzy method can reasonably complete the evaluation of electric vehicle charging station locations.
Exploratory data analysis is increasingly necessary as larger spatial data sets are managed in electro-magnetic media. Spatial clustering is one of the most important spatial data mining techniques; it is the discovery of interesting relationships and characteristics that may exist implicitly in spatial databases. So far, many spatial clustering algorithms have been proposed for applications such as pattern recognition, data analysis, and image processing. However, most of the well-known clustering algorithms have drawbacks, presented later, when applied to large spatial databases. To overcome these limitations, in this paper we propose a robust spatial clustering algorithm named NSCABDT (Novel Spatial Clustering Algorithm Based on Delaunay Triangulation). The Delaunay diagram is used to determine neighborhoods; spatial association rules and collocations are defined on the basis of this neighborhood notion. NSCABDT demonstrates several important advantages over previous work. First, it discovers clusters of arbitrary shape. Second, executing NSCABDT requires no a priori knowledge of the nature of the distribution. Third, experiments show that, like DBSCAN, NSCABDT does not require much CPU processing time. Finally, it handles outliers efficiently.
With the advancement of geospatial data acquisition technology, large volumes of digital data are being collected about our world. These include air- and space-borne imagery, LiDAR data, sonar data, terrestrial laser-scanning data, etc. LiDAR sensors generate huge point datasets with multiple returns. Because of its large size, LiDAR data has costly storage and computational requirements. In this article, a LiDAR compression method based on spatial clustering and optimal filtering is presented. The method consists of classification and spatial clustering of the study area image and creation of the optimal planes in the LiDAR dataset through first-order plane-fitting. First-order plane-fitting is equivalent to the eigenvalue problem of the covariance matrix. Each eigenvalue of the covariance matrix represents the spatial variation along the direction of the corresponding eigenvector. The eigenvector of the minimum eigenvalue is the estimated normal vector of the surface formed by the LiDAR point and its neighbors. The ratio of the minimum eigenvalue to the sum of the eigenvalues approximates the change of local curvature, which determines the deviation of the surface formed by a LiDAR point and its neighbors from the tangential plane formed at that neighborhood. If the minimum eigenvalue is close to zero, for example, then the surface consisting of the point and its neighbors is a plane. The objective of this ongoing research is to develop a LiDAR compression method that can be used in the future at the data acquisition phase to help remove fake returns and redundant points.
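The eigenvalue ratio described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`covariance_3x3`, `symmetric_eigenvalues_3x3`, `surface_variation`) and the sample neighborhoods are assumptions made for illustration, and the eigenvalues are obtained with the standard closed-form trigonometric solution for symmetric 3×3 matrices rather than any method named in the source.

```python
import math

def covariance_3x3(points):
    """Covariance matrix of a neighborhood of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[k] - mean[k] for k in range(3)]
        for a in range(3):
            for b in range(3):
                cov[a][b] += d[a] * d[b] / n
    return cov

def symmetric_eigenvalues_3x3(m):
    """Eigenvalues of a symmetric 3x3 matrix, returned in ascending order."""
    p1 = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    if p1 == 0.0:
        # The matrix is already diagonal.
        return sorted([m[0][0], m[1][1], m[2][2]])
    p2 = sum((m[k][k] - q) ** 2 for k in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest
    return [smallest, middle, largest]

def surface_variation(points):
    """Minimum eigenvalue divided by the sum of eigenvalues."""
    eig = symmetric_eigenvalues_3x3(covariance_3x3(points))
    total = sum(eig)
    return eig[0] / total if total > 0.0 else 0.0

# Hypothetical neighborhoods: four points on the plane z = x + y,
# and the eight corners of a unit cube (no planar structure).
tilted_plane = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 2)]
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

For the planar neighborhood the ratio is zero up to rounding, matching the statement that a near-zero minimum eigenvalue indicates a plane; for the cube corners the three eigenvalues are equal and the ratio reaches its maximum of 1/3.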
Rail transit plays a crucial role in improving urban sustainability and livability. In many Chinese cities, the planning of rail transit routes and stations is focused on facilitating new developments rather than revitalizing existing built-up areas. This approach reflects the local governments’ expectations of substantial growth to reshape the urban structure. However, existing research on transit-oriented development (TOD) rarely explores the spatial interactions between individual transit stations or investigates how they can be integrated to achieve synergistic effects and balanced development. This study proposes that rail transit systems impact urban structure through two “forces”: the provision of additional and reliable carrying capacity, and the reduction of travel time between locations. Metro passenger flow is used as a proxy for these forces, and community detection techniques are employed to identify the actual and optimal spatial clusters in Wuhan, China. The results reveal that the planned sub-centers align reasonably well with the optimal spatial clusters in terms of spatial configuration. However, the actual spatial clusters tend to have longer internal travel times than the optimal clusters. Further exploration suggests the need for equalizing land use density within planned spatial clusters served by the metro system. Additionally, promoting concentrated, differentiated, and mixed functional arrangements in metro station areas with low passenger flows within the planned clusters could be beneficial. This paper presents a new framework for investigating urban spatial clusters influenced by a metro system.
There are significant differences between urban and rural bed-and-breakfasts (B&Bs) in terms of customer positioning, economic strength, and spatial carrier. Accurately identifying the differences in the spatial characteristics and influencing factors of each type is essential for creating urban and rural B&B agglomeration areas. This study used density-based spatial clustering of applications with noise (DBSCAN) and the multi-scale geographically weighted regression (MGWR) model to explore similarities and differences in the spatial distribution patterns and influencing factors of urban and rural B&Bs on the Jiaodong Peninsula of China from 2010 to 2022. The results showed that: 1) Both urban and rural B&Bs on the Jiaodong Peninsula went through three stages: a slow start from 2010 to 2015, rapid development from 2015 to 2019, and hindered development from 2019 to 2022. However, urban B&Bs demonstrated a higher development speed and agglomeration intensity, leading to an increasingly evident trend of uneven development between the two sectors. 2) The clustering scale of both urban and rural B&Bs continued to expand in terms of quantity and volume. Urban B&B clusters were characterized by a limited number but a higher likelihood of transitioning from low-level to high-level clusters. While the number of rural B&B clusters steadily increased over time, their clustering scale was lower than that of urban B&Bs, and they lacked high-level clustering. 3) In terms of development direction, urban B&B clusters exhibited a relatively stable pattern and evolved into high-level clustering centers within the main urban areas. Conversely, rural B&Bs exhibited a more pronounced spatial diffusion effect, with clusters showing a trend of multi-center development along the coastline. 4) Transport emerged as a common influencing factor for both urban and rural B&Bs, with road network density having the strongest explanatory power for their spatial distribution. In terms of differences, population agglomeration had a positive impact on the distribution of urban B&Bs and a negative effect on the distribution of rural B&Bs. Rural B&B clustering was more influenced by tourism resources than urban B&B clustering, but increasing tourist stay duration remains an urgent issue to be addressed. The findings of this study could provide a more precise basis for government planning and management of urban and rural B&B agglomeration areas.
Background: Tuberculosis (TB) is the notifiable infectious disease with the second highest incidence in Qinghai province, a province with poor primary health care infrastructure. Understanding the spatial distribution of TB and related environmental factors is necessary for developing effective strategies to control and further eliminate TB. Methods: TB incidence data and meteorological data were extracted from the China Information System of Disease Control and Prevention and from statistical yearbooks, respectively. We calculated the global and local Moran’s I by using spatial autocorrelation analysis to detect the spatial clustering of TB incidence each year. A spatial panel data model was applied to examine the associations of meteorological factors with TB incidence after adjustment for spatial individual effects and spatial autocorrelation. Results: The local Moran’s I method detected 11 counties with significant high-high spatial clustering (average annual incidence: 294/100 000) and 17 counties with significant low-low spatial clustering (average annual incidence: 68/100 000) of annual TB incidence within the examined five-year period; the global Moran’s I values ranged from 0.40 to 0.58 (all P-values < 0.05). TB incidence was positively associated with temperature, precipitation, and wind speed (all P-values < 0.05), which was confirmed by the spatial panel data model. Each 10 °C, 2 cm, and 1 m/s increase in temperature, precipitation, and wind speed was associated with 9% and 3% decrements and a 7% increment in TB incidence, respectively. Conclusions: High TB incidence areas were mainly concentrated in south-western Qinghai, while low TB incidence areas clustered in eastern and north-western Qinghai. Areas with low temperature and precipitation and with strong wind speeds tended to have higher TB incidences.
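As a rough illustration of the global Moran’s I statistic used in this study, the sketch below computes it for a toy data set. The function names, the chain-shaped adjacency, and the sample values are assumptions for illustration only; the study itself would have used county-level incidence with a real spatial weights matrix.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / w_total) * (cross / denom)

def chain_adjacency(n):
    """Hypothetical weights: units in a row, each adjacent to its neighbors."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

w = chain_adjacency(4)
clustered = morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values side by side
alternating = morans_i([1.0, 5.0, 1.0, 5.0], w)  # dissimilar values side by side
```

Positive values, as in the 0.40 to 0.58 range reported above, indicate that similar incidence levels tend to be neighbors; the alternating toy pattern gives a negative value.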
Traditional spatial clustering methods have the disadvantage of "hard partitioning" and cannot describe the physical characteristics of spatial entities effectively. In view of this, the paper sets forth a general multi-dimensional cloud model, which describes the characteristics of spatial objects more reasonably by allowing for non-homogeneity and asymmetry. Based on the classification and demarcation of infrastructure in Zhanjiang, a detailed interpretation of the clustering results is made from three perspectives: the spatial distribution of the clustering membership degree, a comparison with Fuzzy C-means, and a coupled analysis with residential land prices. The general multi-dimensional cloud model better reflects the integrated characteristics of spatial objects, reveals the spatial distribution of potential information, and realizes spatial division more accurately in complex circumstances. However, due to the complexity of spatial interactions between geographical entities, the generation of the cloud model is a specific and challenging task.
We examined the spatially clustered distribution of jumbo flying squid (Dosidicus gigas) in the offshore waters of Peru bounded by 78°–86°W and 8°–20°S under a 0.5°×0.5° fishing grid. The study is based on the catch-per-unit-effort (CPUE) and fishing effort of the Chinese mainland squid jigging fleet in 2003–2004 and 2006–2013. The data for all years, as well as for the eight years excluding El Niño events, were studied to examine the effect of climate variation on the spatial distribution of D. gigas. Five spatial clusters reflecting the spatial distribution were computed using K-means and Getis-Ord Gi* for a detailed comparative study. Our results showed that the clusters identified by the two methods were quite different in terms of their spatial patterns, and that K-means was not as accurate as Getis-Ord Gi*, as inferred from the agreement degree and the receiver operating characteristic. There were more areas of hot and cold spots in years without the impact of El Niño, suggesting that such large-scale climate variations could reduce the clustering level of D. gigas. The catches also showed that warm El Niño conditions and high water temperature were less favorable for D. gigas offshore Peru. The results suggested that the use of K-means is preferable if the aim is to discover the spatial distribution of each sub-region (cluster) of the study area, while Getis-Ord Gi* is preferable if the aim is to identify statistically significant hot spots that may indicate the central fishing ground.
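The Getis-Ord Gi* statistic mentioned above can be sketched as follows. This is a minimal illustration of the standard z-score form of Gi*, not the study's code: the function names, the one-dimensional strip of grid cells, and the CPUE-like values are hypothetical, and the weights are simple binary neighbor weights that include the cell itself.

```python
import math

def getis_ord_gi_star(values, weights):
    """Gi* z-scores; weights[i][j] must include the self-weight w_ii.

    Assumes the values are not all equal and that no cell is a neighbor
    of every other cell (otherwise the denominator is zero).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        w_sum = sum(w)
        w_sq_sum = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

def strip_weights(n):
    """Hypothetical weights: each cell plus its immediate left/right neighbor."""
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)]
            for i in range(n)]

# Hypothetical CPUE-like values along a strip of nine grid cells.
cpue = [10.0, 10.0, 10.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
scores = getis_ord_gi_star(cpue, strip_weights(9))
```

A cell is usually flagged as a hot spot at the 5% level when its score exceeds about 1.96, which is what distinguishes this approach from K-means: it tests local concentration for significance rather than partitioning the whole area.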
Spatial clustering is widely used in many fields, such as WSNs (Wireless Sensor Networks), web clustering, and remote sensing, to discover groups and identify interesting distributions in the underlying database. By discussing the relationship between the optimal clustering and the initial seeds, a clustering validity index and a principle for seeking initial seeds are proposed, and on this principle we recommend an initial seed-seeking strategy: SSPG (Single-Shortest-Path Graph). When the SSPG strategy is used in clustering algorithms, we find that the clustering result is more likely to be optimal. At the end of the paper, based on combinatorial optimization theory, a method is proposed to obtain an optimal reference value k for the number of clusters, and it is shown to be efficient.
As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by socio-economic development and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics and influencing factors of physical bookstores. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis, and geographical detectors to characterize the spatial distribution pattern of, and the factors influencing, physical bookstores in national central cities/municipalities (hereafter, cities) in western China. Based on spatial data, population density, road density, and other data, this study constructed a data set of the influencing factors of physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution pattern of physical bookstores in Xi’an, Chengdu, and Chongqing is unbalanced. Physical bookstores in Xi’an and Chongqing are distributed from southwest to northeast and are relatively clustered, while those in Chengdu are relatively dispersed. Second, the spatial distribution pattern of physical bookstores has been formed under the influence of different factors. The intensity and significance of the influencing factors differ among the case cities. In general, however, the social factor, the business factor, the density of research facilities, the tourism factor, and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, the explanatory power becomes stronger when factors interact. In Xi’an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors. In Chongqing, however, the traffic factors have stronger explanatory power for the dependent variable after interacting with other factors. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
As the location information of numerous Internet of Things (IoT) devices can be recognized through IoT sensor technology, the need for technology to efficiently analyze spatial data is increasing. One well-known algorithm for grouping dense data into clusters is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Existing DBSCAN research focuses on efficiently finding clusters in numeric or categorical data. In this paper, we propose the novel problem of discovering a set of adjacent clusters among the cluster results derived for each keyword in the keyword-based DBSCAN algorithm. With the existing DBSCAN algorithm, finding adjacent clusters among the derived clusters requires enumerating all possible cases. To solve this problem, we developed the Genetic algorithm-based Keyword Matching DBSCAN (GKM-DBSCAN) algorithm, which applies a genetic algorithm to discover the set of adjacent clusters among the cluster results derived for each keyword. To improve the performance of GKM-DBSCAN, we improved the general genetic algorithm by performing genetic operations in groups. We conducted extensive experiments on both real and synthetic datasets to compare the effectiveness of GKM-DBSCAN with that of the brute-force method. The experimental results show that GKM-DBSCAN outperforms the brute-force method by up to 21 times. GKM-DBSCAN with index number binarization (INB) is 1.8 times faster than GKM-DBSCAN with cluster number binarization (CNB).
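For readers unfamiliar with the base algorithm that this and several other abstracts in the list build on, here is a minimal sketch of plain DBSCAN. It is a textbook version with a brute-force neighbor search, not GKM-DBSCAN or any paper's variant; the names `region_query`, `dbscan`, `eps`, `min_pts`, and the sample points are chosen for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Hypothetical device locations: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The brute-force neighbor search makes this quadratic in the number of points; practical implementations typically use a spatial index, and the sampling-based and adaptive-parameter variants described elsewhere in this list target exactly that cost and the choice of `eps` and `min_pts`.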
HFMD can be caused by a variety of enteroviruses, including Coxsackievirus A16 and enterovirus 71. At present there are no effective therapeutic measures to cure HFMD. This study therefore aimed to analyze the spatial correlation and the local clustering types of HFMD, based on the theory of spatial analysis and the spatial autocorrelation analysis modules of ArcGIS and GeoDa. We found a seasonal trend in HFMD. The lowest incidence appeared in February, and the reported incidence peaked from May to June. In most cases, however, another peak appeared from September to November. The trend of incidence was also related to age. The overall trend of the reported incidence was U-shaped in the north-south direction and inverted-U-shaped in the east-west direction. The spatial distribution of HFMD showed positive spatial autocorrelation. Hunan, Guangxi, and Guangdong were hot spots, while the cold spots were Jilin, Inner Mongolia, Xinjiang, Gansu, and Qinghai.
Earthquakes exhibit clear clustering on the earth. It is important to explore the spatial-temporal characteristics of seismicity clusters and their spatial heterogeneity. We analyze the effects of plate space, tectonic style, and their interaction on the characteristics of clusters. Based on data for earthquakes of moment magnitude (M_w) 5.6 or greater from 1960 to 2014, this study used the spatial-temporal scan method to identify earthquake clusters. The results indicate that seismic spatial-temporal clusters can be classified into two types based on duration: persistent clusters and burst clusters. Finally, we analyzed the spatial heterogeneity of the two types. The main conclusions are as follows: 1) Ninety percent of the persistent clusters last for 22-38 yr and show a high clustering likelihood; ninety percent of the burst clusters last for 1-1.78 yr and show a high relative risk. 2) The persistent clusters are mainly distributed in interplate zones, especially along the western margin of the Pacific Ocean. The burst clusters are distributed in both intraplate and interplate zones, and are slightly concentrated in the India-Eurasia interaction zone. 3) For the persistent type, plate interaction plays an important role in the distribution of the clusters’ likelihood and relative risk. In addition, the tectonic style further enhances the spatial heterogeneity. 4) For the burst type, neither plate activity nor tectonic style has an obvious effect on the distribution of the clusters’ likelihood and relative risk. Nevertheless, the interaction between these two spatial factors enhances the spatial heterogeneity, especially in terms of relative risk.
Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and two sampling-based DBSCAN (SDBSCAN) algorithms are developed. One algorithm introduces sampling inside DBSCAN, and the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
A shared-nothing spatial database cluster is a system that provides continuous service even if a failure occurs in any node, so efficient recovery from system failure is very important. Generally, the existing method recovers the failed node by using both the cluster log and the local log. This method, however, causes several problems that increase the communication cost and the size of the cluster log. This paper proposes a novel recovery method that uses recently updated record information in a shared-nothing spatial database cluster. The proposed technique utilizes the update information of records and pointers to the actual data, which reduces the log size and the communication cost. Consequently, the recovery time of the failed node is reduced because fewer update operations need to be processed.
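Inland waterway traffic accidents occur from time to time and threaten the safe and efficient development of waterway transport. This study proposes a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method with adaptive parameters for identifying accident black-spot waters on inland waterways. The method supports automatic selection of the neighborhood radius ε and of the threshold P_min on the number of data objects in a neighborhood, which can improve the accuracy and efficiency of the cluster analysis. A case study was carried out on bulk carrier accident data for the lower reaches of the Yangtze River main line from 2010 to 2019; the accident characteristics and causes of each typical black-spot section were analyzed, and eight accident black spots were obtained. In addition, Getis-Ord General G clustering was used to identify high-severity accident areas within the black spots, showing that the black spots and high-severity accidents are mainly distributed around mid-channel shoals, bridge areas, and port and wharf areas. The results are broadly consistent with the actual situation, which indicates to some extent that the method is scientifically sound and practical for analyzing the distribution characteristics of inland waterway traffic accidents.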
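To address the removal of noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. The DBSCAN algorithm is then optimized to reduce its time complexity and to adjust its parameters adaptively, so that the point cloud is divided into normal clusters, suspected clusters, and abnormal clusters, and the abnormal clusters are removed immediately. A distance-consensus evaluation method is used to make a fine-grained judgment on the suspected clusters: the distance between each suspected point and the surface fitted to its nearest normal points is computed to decide whether the point is abnormal, which effectively preserves the key features of the data and the sensitivity of the model. The method was used to denoise the point clouds of two hull blocks and was compared with other denoising algorithms. The results show that the method has advantages in denoising efficiency and feature preservation and accurately retains the geometric characteristics of the point cloud data.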
Funding (electric vehicle charging station location study): supported by the National Natural Science Foundation of China (No. 51575047).
文摘Exploratory data analysis is increasingly more necessary as larger spatial data is managed in electro-magnetic media. Spatial clustering is one of the very important spatial data mining techniques which is the discovery of interesting rela-tionships and characteristics that may exist implicitly in spatial databases. So far, a lot of spatial clustering algorithms have been proposed in many applications such as pattern recognition, data analysis, and image processing and so forth. However most of the well-known clustering algorithms have some drawbacks which will be presented later when ap-plied in large spatial databases. To overcome these limitations, in this paper we propose a robust spatial clustering algorithm named NSCABDT (Novel Spatial Clustering Algorithm Based on Delaunay Triangulation). Delaunay dia-gram is used for determining neighborhoods based on the neighborhood notion, spatial association rules and colloca-tions being defined. NSCABDT demonstrates several important advantages over the previous works. Firstly, it even discovers arbitrary shape of cluster distribution. Secondly, in order to execute NSCABDT, we do not need to know any priori nature of distribution. Third, like DBSCAN, Experiments show that NSCABDT does not require so much CPU processing time. Finally it handles efficiently outliers.
Abstract: With advances in geospatial data acquisition technology, large volumes of digital data are being collected about our world. These include air- and space-borne imagery, LiDAR data, sonar data, and terrestrial laser-scanning data. LiDAR sensors generate huge datasets of points with multiple returns. Because of its size, LiDAR data has costly storage and computational requirements. In this article, a LiDAR compression method based on spatial clustering and optimal filtering is presented. The method consists of classification and spatial clustering of the study area image, followed by creation of the optimal planes in the LiDAR dataset through first-order plane fitting. First-order plane fitting is equivalent to the eigenvalue problem of the covariance matrix. Each eigenvalue of the covariance matrix represents the spatial variation along the direction of the corresponding eigenvector. The eigenvector of the minimum eigenvalue is the estimated normal vector of the surface formed by the LiDAR point and its neighbors. The ratio of the minimum eigenvalue to the sum of the eigenvalues approximates the change of local curvature, which determines how far the surface formed by a LiDAR point and its neighbors deviates from the tangential plane at that neighborhood. If the minimum eigenvalue is close to zero, for example, the surface consisting of the point and its neighbors is a plane. The objective of this ongoing research is to develop a LiDAR compression method that can be used in the future at the data acquisition phase to help remove fake returns and redundant points.
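The eigenvalue ratio described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`covariance3`, `sym3_eigenvalues`, `surface_variation`) are hypothetical, the neighborhood is simply passed in as a list of (x, y, z) tuples, and the eigenvalues are obtained with the standard closed-form trigonometric method for symmetric 3x3 matrices so that only the Python standard library is needed.

```python
import math

def covariance3(points):
    """Return the 3x3 covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[k] - mean[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov

def sym3_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix in ascending order
    (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # b = (a - q * I) / p has eigenvalues in [-2, 2]
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # trace is the eigenvalue sum
    return [smallest, middle, largest]

def surface_variation(points):
    """Minimum eigenvalue divided by the sum of eigenvalues."""
    eig = sym3_eigenvalues(covariance3(points))
    total = sum(eig)
    return eig[0] / total if total > 0 else 0.0

# Toy neighborhoods (assumed data): a tilted plane and the corners of a cube.
plane = [(x, y, 0.5 * x + 0.2 * y) for x in range(5) for y in range(5)]
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

For the tilted plane the ratio is zero up to rounding, which matches the abstract's statement that a near-zero minimum eigenvalue indicates a planar neighborhood; for the cube corners the three eigenvalues are equal and the ratio is one third.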
Abstract: Rail transit plays a crucial role in improving urban sustainability and livability. In many Chinese cities, the planning of rail transit routes and stations is focused on facilitating new developments rather than revitalizing existing built-up areas. This approach reflects the local governments’ expectations of substantial growth to reshape the urban structure. However, existing research on transit-oriented development (TOD) rarely explores the spatial interactions between individual transit stations or investigates how they can be integrated to achieve synergistic effects and balanced development. This study proposes that rail transit systems impact urban structure through two “forces”: the provision of additional and reliable carrying capacity and the reduction of travel time between locations. Metro passenger flow is used as a proxy for these forces, and community detection techniques are employed to identify the actual and optimal spatial clusters in Wuhan, China. The results reveal that the planned sub-centers align reasonably well with the optimal spatial clusters in terms of spatial configuration. However, the actual spatial clusters tend to have longer internal travel times than the optimal clusters. Further exploration suggests the need for equalizing land use density within planned spatial clusters served by the metro system. Additionally, promoting concentrated, differentiated, and mixed functional arrangements in metro station areas with low passenger flows within the planned clusters could be beneficial. This paper presents a new framework for investigating urban spatial clusters influenced by a metro system.
Funding: Under the auspices of the National Social Science Foundation of China (No. 21BJY202).
Abstract: There are significant differences between urban and rural bed-and-breakfasts (B&Bs) in terms of customer positioning, economic strength, and spatial carrier. Accurately identifying the differences in spatial characteristics and influencing factors of each type is essential for creating urban and rural B&B agglomeration areas. This study used density-based spatial clustering of applications with noise (DBSCAN) and the multi-scale geographically weighted regression (MGWR) model to explore similarities and differences in the spatial distribution patterns and influencing factors of urban and rural B&Bs on the Jiaodong Peninsula of China from 2010 to 2022. The results showed that: 1) Both urban and rural B&Bs on the Jiaodong Peninsula went through three stages: a slow start from 2010 to 2015, rapid development from 2015 to 2019, and hindered development from 2019 to 2022. However, urban B&Bs demonstrated a higher development speed and agglomeration intensity, leading to an increasingly evident trend of uneven development between the two sectors. 2) The clustering scale of both urban and rural B&Bs continued to expand in terms of quantity and volume. Urban B&B clusters were characterized by a limited number but a higher likelihood of transitioning from low-level to high-level clusters. Although the number of rural B&B clusters steadily increased over time, their clustering scale was lower than that of urban B&Bs, and they lacked high-level clustering. 3) In terms of development direction, urban B&B clusters exhibited a relatively stable pattern and evolved into high-level clustering centers within the main urban areas. Conversely, rural B&Bs exhibited a more pronounced spatial diffusion effect, with clusters showing a trend of multi-center development along the coastline. 4) Transport emerged as a common influencing factor for both urban and rural B&Bs, with road network density having the strongest explanatory power for their spatial distribution. In terms of differences, population agglomeration had a positive impact on the distribution of urban B&Bs and a negative effect on the distribution of rural B&Bs. Rural B&B clustering was more influenced by tourism resources than urban B&B clustering, but increasing tourist stay duration remains an urgent issue to be addressed. The findings of this study could provide a more precise basis for government planning and management of urban and rural B&B agglomeration areas.
Funding: This study was supported by the Qinghai Center for Disease Control and Prevention (CDC).
Abstract: Background: Tuberculosis (TB) is the notifiable infectious disease with the second highest incidence in Qinghai, a province with poor primary health care infrastructure. Understanding the spatial distribution of TB and related environmental factors is necessary for developing effective strategies to control and further eliminate TB. Methods: TB incidence data and meteorological data were extracted from the China Information System of Disease Control and Prevention and statistical yearbooks, respectively. We calculated the global and local Moran’s I using spatial autocorrelation analysis to detect the spatial clustering of TB incidence each year. A spatial panel data model was applied to examine the associations of meteorological factors with TB incidence after adjustment for spatial individual effects and spatial autocorrelation. Results: The local Moran’s I method detected 11 counties with significant high-high spatial clustering (average annual incidence: 294/100,000) and 17 counties with significant low-low spatial clustering (average annual incidence: 68/100,000) of annual TB incidence within the examined five-year period; the global Moran’s I values ranged from 0.40 to 0.58 (all P-values < 0.05). TB incidence was significantly associated with temperature, precipitation, and wind speed (all P-values < 0.05), as confirmed by the spatial panel data model. Each 10 °C, 2 cm, and 1 m/s increase in temperature, precipitation, and wind speed was associated with 9% and 3% decrements and a 7% increment in TB incidence, respectively. Conclusions: High TB incidence areas were mainly concentrated in south-western Qinghai, while low TB incidence areas clustered in eastern and north-western Qinghai. Areas with low temperature and precipitation and with strong winds tended to have higher TB incidence.
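As a rough illustration of the global Moran's I statistic used in this abstract, the following sketch computes I = (n / S0) * Σ_ij w_ij z_i z_j / Σ_i z_i², where z_i is the deviation of region i from the mean and S0 is the sum of all weights. The function names and the chain-shaped neighbor structure are assumptions made for the example; the study itself worked with county-level adjacency, and significance testing is omitted here.

```python
def global_morans_i(values, weights):
    """Global Moran's I for values[i] and a square weight matrix weights[i][j]."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    s0 = sum(sum(row) for row in weights)               # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))    # weighted cross-products
    ss = sum(d * d for d in z)                          # sum of squared deviations
    return (n / s0) * (cross / ss)

def chain_weights(n):
    """Binary adjacency for n regions arranged in a line (toy neighbor structure)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w

# Toy incidence values (assumed data, not from the study).
clustered = [1.0, 1.0, 5.0, 5.0]     # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]   # neighbors always differ

i_clustered = global_morans_i(clustered, chain_weights(4))
i_alternating = global_morans_i(alternating, chain_weights(4))
```

With these toy inputs the clustered arrangement gives a positive I and the alternating arrangement gives a negative I, which is the sense in which the reported values of 0.40 to 0.58 indicate positive spatial autocorrelation.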
Funding: National Natural Science Foundation of China, No. 40971102; Knowledge Innovation Project of the Chinese Academy of Sciences, No. KZCX2-YW-322; Special Grant for Postgraduates' Scientific Innovation and Social Practice in 2008.
Abstract: Traditional spatial clustering methods have the disadvantage of "hard division" and cannot describe the physical characteristics of spatial entities effectively. In view of this, the paper sets forth a general multi-dimensional cloud model, which describes the characteristics of spatial objects more reasonably by allowing for non-homogeneity and asymmetry. Based on the classification and demarcation of infrastructure in Zhanjiang, the clustering results are interpreted in detail through the spatial distribution of the clustering membership degree, a comparison with Fuzzy C-means, and a coupled analysis of residential land prices. The general multi-dimensional cloud model better reflects the integrated characteristics of spatial objects, reveals the spatial distribution of potential information, and achieves more accurate spatial division in complex circumstances. However, because of the complexity of spatial interactions between geographical entities, generating the cloud model is a specific and challenging task.
Funding: Supported by the National Natural Science Foundation of China (41406146 and 41476129) and the Shanghai Universities First-class Disciplines Project, Fisheries (A).
Abstract: We examined the spatially clustered distribution of jumbo flying squid (Dosidicus gigas) in the offshore waters of Peru bounded by 78°–86°W and 8°–20°S on a 0.5°×0.5° fishing grid. The study is based on the catch-per-unit-effort (CPUE) and fishing effort of the Chinese mainland squid jigging fleet in 2003–2004 and 2006–2013. The data for all years as well as for the eight years excluding El Niño events were studied to examine the effect of climate variation on the spatial distribution of D. gigas. Five spatial clusters reflecting the spatial distribution were computed using K-means and Getis-Ord Gi* for a detailed comparative study. Our results showed that the clusters identified by the two methods were quite different in their spatial patterns, and that K-means was not as accurate as Getis-Ord Gi*, as inferred from the agreement degree and the receiver operating characteristic. There were more hot-spot and cold-spot areas in years without the impact of El Niño, suggesting that such large-scale climate variations could reduce the clustering level of D. gigas. The catches also showed that warm El Niño conditions and high water temperature were less favorable for D. gigas offshore of Peru. The results suggest that K-means is preferable if the aim is to discover the spatial distribution of each sub-region (cluster) of the study area, while Getis-Ord Gi* is preferable if the aim is to identify statistically significant hot spots that may indicate the central fishing ground.
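To make the hot-spot side of this comparison concrete, here is a minimal sketch of the Getis-Ord Gi* z-score in its commonly used form: for cell i, the weighted sum of neighboring values (including the cell itself) minus its expectation, divided by its standard deviation under the global mean and variance. The function name, the one-dimensional row of grid cells, and the CPUE numbers are assumptions made for illustration; the study used a two-dimensional 0.5° grid and real fleet data.

```python
import math

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every cell; weights[i][j] > 0 marks j as a neighbor of i,
    and weights[i][i] should be nonzero so that each cell counts itself."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)  # global std dev
    scores = []
    for i in range(n):
        w = weights[i]
        sw = sum(w)                       # sum of weights for cell i
        sw2 = sum(x * x for x in w)       # sum of squared weights for cell i
        numerator = sum(w[j] * values[j] for j in range(n)) - mean * sw
        denominator = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
        scores.append(numerator / denominator)
    return scores

def band_weights(n):
    """Each cell is a neighbor of itself and of the adjacent cells in a row."""
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)]
            for i in range(n)]

# Toy CPUE values along a row of grid cells (assumed data).
cpue = [1.0, 1.0, 8.0, 9.0, 8.0, 1.0, 1.0]
z_scores = getis_ord_gi_star(cpue, band_weights(len(cpue)))
```

A cell is usually flagged as a hot spot when its z-score exceeds about 1.96 (5% significance, two-sided) and as a cold spot when it falls below about -1.96. The sketch does not guard against degenerate inputs such as constant values or a cell whose neighborhood covers the whole grid.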
Funding: Supported by the National Natural Science Foundation of China (No. 60502028, No. 90204008).
Abstract: Spatial clustering is widely used in fields such as wireless sensor networks (WSN), web clustering, and remote sensing to discover groups and to identify interesting distributions in the underlying database. By discussing the relationship between the optimal clustering and the initial seeds, a clustering validity index and a principle for seeking initial seeds are proposed, and on this principle we recommend an initial seed-seeking strategy: SSPG (Single-Shortest-Path Graph). When the SSPG strategy is used in clustering algorithms, we find that the clustering result is more likely to be optimal. At the end of the paper, based on combinatorial optimization theory, a method is proposed to obtain an optimal reference value k for the number of clusters, and it is shown to be efficient.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41271179).
Abstract: As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by socioeconomic development and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics and influencing factors of physical bookstores. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis, and geographical detectors to examine the spatial distribution pattern of physical bookstores and the factors influencing it in national central cities/municipalities (hereafter, cities) in western China. Based on spatial data, population density, road density, and other data, this study constructed a data set of the influencing factors of physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution pattern of physical bookstores in Xi’an, Chengdu, and Chongqing is unbalanced. Physical bookstores in Xi’an and Chongqing are distributed from southwest to northeast and are relatively clustered, while those in Chengdu are relatively dispersed. Second, the spatial distribution pattern of physical bookstores has formed under the influence of different factors. The intensity and significance of the influencing factors differ among the case cities. In general, however, the social factor, the business factor, the density of research facilities, the tourism factor, and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, explanatory power becomes stronger when factors interact. In Xi’an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors. In Chongqing, however, the traffic factors have stronger explanatory power for the dependent variable after interacting with other factors. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korea government (MSIT) (No. 2021R1F1A1049387).
Abstract: As the location information of numerous Internet of Things (IoT) devices can be recognized through IoT sensor technology, the need for technology to analyze spatial data efficiently is increasing. One of the best-known algorithms for grouping dense data into a cluster is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Existing DBSCAN research focuses on efficiently finding clusters in numeric or categorical data. In this paper, we propose the novel problem of discovering a set of adjacent clusters among the cluster results derived for each keyword in the keyword-based DBSCAN algorithm. With the existing DBSCAN algorithm, finding adjacent clusters among the resulting clusters requires enumerating all possible cases. To solve this problem, we developed the Genetic algorithm-based Keyword Matching DBSCAN (GKM-DBSCAN) algorithm, which applies a genetic algorithm to discover the set of adjacent clusters among the cluster results derived for each keyword. To improve the performance of GKM-DBSCAN, we improved the general genetic algorithm by performing genetic operations in groups. We conducted extensive experiments on both real and synthetic datasets to show the effectiveness of GKM-DBSCAN compared with the brute-force method. The experimental results show that GKM-DBSCAN outperforms the brute-force method by up to 21 times. GKM-DBSCAN with index number binarization (INB) is 1.8 times faster than GKM-DBSCAN with cluster number binarization (CNB).
Funding: The National Natural Social Science Fund of China (Grant No. 17AJY008).
Abstract: Hand, foot, and mouth disease (HFMD) can be caused by a variety of enteroviruses, including Coxsackievirus A16 and enterovirus 71. There are currently no effective therapeutic measures to cure HFMD. This study therefore aimed to analyze the spatial correlation and local clustering types of HFMD based on spatial analysis theory and the spatial autocorrelation analysis modules of ArcGIS and GeoDa. We found a seasonal trend in HFMD. The lowest incidence appeared in February, and the reported incidence peaked from May to June. In most cases, however, another peak appeared from September to November. The incidence trend was also related to age. The overall trend of the reported incidence was U-shaped in the north-south direction and inverted U-shaped in the east-west direction. The spatial autocorrelation of HFMD incidence was positive. Hunan, Guangxi, and Guangdong were the hot spots, while the cold spots were Jilin, Inner Mongolia, Xinjiang, Gansu, and Qinghai.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41771537) and the Fundamental Research Funds for the Central Universities.
Abstract: Earthquakes exhibit clear clustering on the Earth. It is important to explore the spatial-temporal characteristics of seismicity clusters and their spatial heterogeneity. We analyze the effects of plate space, tectonic style, and their interaction on the characteristics of clusters. Based on data for earthquakes of moment magnitude (M_w) 5.6 or greater from 1960 to 2014, this study used the spatial-temporal scan method to identify earthquake clusters. The results indicate that seismic spatial-temporal clusters can be classified into two types based on duration: persistent clusters and burst clusters. Finally, we analyzed the spatial heterogeneity of the two types. The main conclusions are as follows: 1) Ninety percent of the persistent clusters last for 22-38 yr and show a high clustering likelihood; ninety percent of the burst clusters last for 1-1.78 yr and show a high relative risk. 2) The persistent clusters are mainly distributed in interplate zones, especially along the western margin of the Pacific Ocean. The burst clusters are distributed in both intraplate and interplate zones, and are slightly concentrated in the India-Eurasia interaction zone. 3) For the persistent type, plate interaction plays an important role in the distribution of the clusters’ likelihood and relative risk. In addition, the tectonic style further enhances the spatial heterogeneity. 4) For the burst type, neither plate activity nor tectonic style has an obvious effect on the distribution of the clusters’ likelihood and relative risk. Nevertheless, the interaction between these two spatial factors enhances the spatial heterogeneity, especially in terms of relative risk.
Funding: Supported by the Open Researches Fund Program of LIESMARS (WKL(00)0302).
Abstract: Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine sampling techniques with the DBSCAN algorithm to cluster large spatial databases, and develop two sampling-based DBSCAN (SDBSCAN) algorithms. One algorithm introduces sampling inside DBSCAN, and the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
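For readers unfamiliar with the baseline that the sampling variants build on, the following is a minimal sketch of standard DBSCAN, not of the SDBSCAN algorithms described in this abstract. The function names, the brute-force neighbor search, and the toy points are assumptions for illustration; `eps` is the neighborhood radius and `min_pts` the minimum number of points (including the point itself) needed for a core point. It requires Python 3.8 or later for `math.dist`.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute force, includes i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for noise."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE              # may later become a border point
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point: joins, is not expanded
            if labels[j] is not None:
                continue                   # already assigned to a cluster
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours) # j is also a core point: keep growing
    return labels

# Toy data (assumed): two tight squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The brute-force `region_query` makes this sketch quadratic in the number of points, which is the cost that motivates sampling or spatial indexing on large spatial databases.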
Funding: This work is supported by the University IT Research Center Project, Korea.
Abstract: A shared-nothing spatial database cluster is a system that provides continuous service even if a failure occurs in any node, so efficient recovery from system failure is very important. Generally, the existing method recovers the failed node by using both the cluster log and the local log. This method, however, causes several problems that increase the communication cost and the size of the cluster log. This paper proposes a novel recovery method that uses recently updated record information in a shared-nothing spatial database cluster. The proposed technique uses the update information of records and pointers to the actual data, which reduces the log size and the communication cost. Consequently, the recovery time of a failed node is reduced because fewer update operations need to be processed.
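Abstract: Inland waterway traffic accidents occur from time to time and threaten the safe and efficient development of waterway transport. This study proposes a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method with adaptive parameters for identifying accident black-spot waters on inland waterways. The method supports automatic selection of the neighborhood radius ε and the threshold P_min on the number of data objects in a neighborhood, which can improve the accuracy and efficiency of the cluster analysis. A case study was carried out using bulk carrier accident data from the lower reaches of the Yangtze River trunk line from 2010 to 2019. The accident characteristics and causes of each typical black-spot section were analyzed, and 8 accident black spots were obtained. In addition, Getis-Ord General G clustering was used to identify high-severity accident areas within the black spots, showing that accident black spots and high-severity accidents are mainly distributed around mid-channel bars, bridge areas, and port and wharf areas. The results are broadly consistent with the actual situation, which to some extent demonstrates the scientific soundness and practicality of the method for analyzing the distribution characteristics of inland waterway traffic accidents.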
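Abstract: To address the removal of noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. The DBSCAN algorithm is optimized to reduce its time complexity and to adjust its parameters adaptively, so that the point cloud is divided into normal, suspected, and abnormal clusters, and the abnormal clusters are removed immediately. A distance consensus evaluation method is then used to assess the suspected clusters in detail: the distance between each suspected point and the surface fitted to its nearest normal points is computed to decide whether the point is anomalous, which effectively preserves the key features of the data and the model sensitivity. The method was used to denoise the point clouds of two hull blocks and was compared with other denoising algorithms. The results show that the method has advantages in denoising efficiency and feature preservation and accurately retains the geometric characteristics of the point cloud data.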