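Abstract: Waterway traffic accidents on inland rivers occur from time to time and threaten the safe, efficient development of waterborne transport. This study proposes a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method with adaptive parameters for identifying accident black-spot waters on inland rivers. The method automatically selects the neighborhood radius ε and the threshold P_min on the number of data objects in a neighborhood, which can improve the accuracy and efficiency of the cluster analysis. A case study based on accident data for bulk carriers on the lower reaches of the Yangtze River main line from 2010 to 2019 identified 8 accident black spots and analyzed the accident characteristics and causes of each typical black-spot section. In addition, Getis-Ord General G clustering was used to identify high-severity accident areas within the black spots, showing that black spots and high-severity accidents are mainly distributed around mid-channel bars, bridge areas, and port and wharf areas. The results are broadly consistent with actual conditions, which to some extent indicates that the method is scientifically sound and practical for analyzing the distribution characteristics of inland waterway traffic accidents.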
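The abstract does not say how ε and P_min are selected. As a rough illustration of what automatic ε selection can look like, the sketch below uses the widely used k-distance heuristic: sort every point's distance to its k-th nearest neighbour and take ε just before the largest jump. This is an assumed stand-in, not the paper's procedure; the function names `k_distances` and `suggest_eps` and the largest-gap rule are hypothetical choices made for this example.

```python
import math


def k_distances(points, k):
    """Sorted distance from every point to its k-th nearest neighbour."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)


def suggest_eps(points, k):
    """Return the k-distance just before the largest jump in the sorted curve.

    Assumption: the largest gap is used as a crude proxy for the "knee"
    that is normally read off a k-distance plot by eye.
    """
    kd = k_distances(points, k)
    gaps = [kd[i + 1] - kd[i] for i in range(len(kd) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return kd[i]


if __name__ == "__main__":
    # Two tight 2x2 blocks of points plus one far-away outlier (made-up data).
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(suggest_eps(pts, k=2))
```

Here k plays the role of the neighbourhood count threshold, so one plausible workflow is to fix k first and derive ε from it; real accident data would need a spatial index rather than this quadratic scan.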
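Abstract: To address the low efficiency and misjudgments of ship encounter identification algorithms on large data sets, a density-based spatial clustering of applications with noise (DBSCAN) algorithm that incorporates the International Regulations for Preventing Collisions at Sea (COLREGs) is proposed, and a ship encounter identification model is built. When DBSCAN counts the ships within a neighborhood, the distance to closest point of approach (DCPA) and the time to closest point of approach (TCPA) between ships are calculated to perform an initial screening of noise points in the neighborhood. Ship encounter risk is then calculated with a fuzzy comprehensive evaluation model, and the ships in the neighborhood are screened a second time to extract ship encounter situations. The results show that the improved DBSCAN algorithm filters out the non-encounter situations identified by the traditional DBSCAN algorithm and keeps the number of ships in any single encounter situation within 4. The risk evolution trends output for encountering ships are well suited to monitoring high-risk ships in real waters and can effectively assist ship collision avoidance. The proposed identification model is of great significance for ensuring navigation safety and improving the efficiency of maritime supervision.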
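DCPA and TCPA have a standard closed form when both ships are assumed to hold constant course and speed. The sketch below computes them in a flat local coordinate frame. The function names and the screening thresholds `dcpa_max` and `tcpa_max` are hypothetical values chosen for illustration, not the ones used in the paper.

```python
import math


def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (dcpa, tcpa) for two ships moving at constant velocity.

    Positions and velocities are (x, y) tuples in consistent units,
    e.g. nautical miles and knots, which gives TCPA in hours.
    A negative TCPA means the closest approach is already in the past.
    """
    # Position and velocity of the target relative to own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    v2 = vx * vx + vy * vy
    if v2 == 0.0:
        # Identical velocities: the range between the ships never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / v2
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa


def is_candidate_encounter(dcpa, tcpa, dcpa_max=1.0, tcpa_max=0.5):
    """Hypothetical first-pass filter: close approach that lies ahead in time."""
    return 0.0 <= tcpa <= tcpa_max and dcpa <= dcpa_max


if __name__ == "__main__":
    # Own ship heading north at 10 kn; target 5 nm east and 5 nm north, heading west at 10 kn.
    d, t = cpa((0.0, 0.0), (0.0, 10.0), (5.0, 5.0), (-10.0, 0.0))
    print(d, t, is_candidate_encounter(d, t))
```

In a DBSCAN neighbourhood query, a filter of this kind could be applied to each ship pair before the pair is counted as neighbours, which is one way to read the initial screening step described above.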
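Abstract: To remove noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is first applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. The DBSCAN algorithm is then optimized to reduce its time complexity and adjust its parameters adaptively, so that the point cloud is divided into normal clusters, suspected clusters, and abnormal clusters, and the abnormal clusters are removed immediately. A distance consensus assessment method then makes a fine-grained judgment on the suspected clusters: the distance between each suspected point and the surface fitted to its nearest normal points is calculated to decide whether the point is abnormal, which effectively preserves the key features of the data and the sensitivity of the model. The method was used to denoise the point clouds of two hull blocks and was compared with other denoising algorithms. The results show that it has advantages in denoising efficiency and feature preservation and accurately retains the geometric characteristics of the point cloud data.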
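The statistical pre-screening step can be sketched as the usual statistical outlier removal filter: compute each point's mean distance to its k nearest neighbours and drop points whose value exceeds the global mean by more than a multiple of the standard deviation. The parameters `k` and `alpha` and the function name are assumptions for this example; the abstract does not give the paper's settings.

```python
import math
from statistics import mean, pstdev


def statistical_filter(points, k=3, alpha=1.0):
    """Keep points whose mean k-NN distance is at most mean + alpha * std.

    Brute-force O(n^2) neighbour search, fine for a small illustration only.
    """
    avg = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        avg.append(mean(d[:k]))
    limit = mean(avg) + alpha * pstdev(avg)
    return [p for p, a in zip(points, avg) if a <= limit]


if __name__ == "__main__":
    # A flat 3x3 patch of points plus one isolated point far from the surface.
    cloud = [(x, y, 0) for x in range(3) for y in range(3)] + [(50, 50, 50)]
    print(len(statistical_filter(cloud)))
```

This only removes large-scale, isolated noise; points that sit close to the surface survive the filter, which is why the method described above follows it with clustering and a finer distance-based check.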
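Abstract: To quickly and accurately extract individual tree point clouds from forest stand point clouds acquired by a terrestrial 3D laser scanner, a tree segmentation algorithm based on density-based spatial clustering of applications with noise (DBSCAN) is proposed. Gaussian filtering is first used to denoise the stand point cloud, and the normalized stand point cloud is divided into vertical segments. DBSCAN is then applied to cluster each vertical segment, the center point of every cluster in every vertical segment is calculated, adjacency between clusters is determined from the distances between cluster centers, and trunk-segment point clouds are matched accordingly. Finally, the RANSAC (Random Sample Consensus) algorithm fits a straight line to the trunk-segment point clouds, and each point is assigned according to its distance from the fitted lines to complete the tree segmentation. In stands with medium and high canopy density, the F-score of the proposed algorithm ranges from 0.88 to 0.99 and from 0.72 to 0.74, respectively, while that of a distance-discrimination-based tree segmentation algorithm ranges from 0.84 to 0.90 and from 0.73 to 0.79. The proposed algorithm effectively segments individual tree point clouds in stands of different canopy density and performs particularly well in stands with medium canopy density, enabling accurate tree segmentation of stand point clouds.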
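The final assignment step, deciding which tree a point belongs to from its distance to the fitted trunk lines, can be sketched as follows. The trunk lines are taken as given (each as a point on the line plus a direction vector), so the RANSAC fit itself is not shown. The cut-off `max_dist` and the function names are hypothetical.

```python
import math


def point_line_distance(p, a, d):
    """Distance from 3D point p to the line through a with direction d."""
    ap = [p[i] - a[i] for i in range(3)]
    # |ap x d| / |d| is the perpendicular distance to the line.
    cross = (
        ap[1] * d[2] - ap[2] * d[1],
        ap[2] * d[0] - ap[0] * d[2],
        ap[0] * d[1] - ap[1] * d[0],
    )
    return math.hypot(*cross) / math.hypot(*d)


def assign_to_trunks(points, trunks, max_dist):
    """Label each point with the index of its nearest trunk line, or -1 if too far."""
    labels = []
    for p in points:
        dists = [point_line_distance(p, a, d) for a, d in trunks]
        best = min(range(len(trunks)), key=dists.__getitem__)
        labels.append(best if dists[best] <= max_dist else -1)
    return labels


if __name__ == "__main__":
    # Two vertical trunk lines 10 m apart (made-up geometry).
    trunks = [((0, 0, 0), (0, 0, 1)), ((10, 0, 0), (0, 0, 1))]
    pts = [(1, 0, 5), (9, 1, 2), (5, 50, 0)]
    print(assign_to_trunks(pts, trunks, max_dist=3.0))
```

A single fixed distance threshold is the simplest possible rule; crowns of neighbouring trees overlap more as canopy density rises, which is consistent with the lower F-scores reported for dense stands.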
Funding: Under the auspices of the National Social Science Foundation of China (No. 21BJY202).
Abstract: There are significant differences between urban and rural bed-and-breakfasts (B&Bs) in terms of customer positioning, economic strength and spatial carrier. Accurately identifying the differences in the spatial characteristics and influencing factors of each type is essential for creating urban and rural B&B agglomeration areas. This study used density-based spatial clustering of applications with noise (DBSCAN) and the multi-scale geographically weighted regression (MGWR) model to explore similarities and differences in the spatial distribution patterns and influencing factors of urban and rural B&Bs on the Jiaodong Peninsula of China from 2010 to 2022. The results showed that: 1) Both urban and rural B&Bs on the Jiaodong Peninsula went through three stages: a slow start from 2010 to 2015, rapid development from 2015 to 2019, and hindered development from 2019 to 2022. However, urban B&Bs demonstrated a higher development speed and agglomeration intensity, leading to an increasingly evident trend of uneven development between the two sectors. 2) The clustering scale of both urban and rural B&Bs continued to expand in terms of quantity and volume. Urban B&B clusters were characterized by a limited number but a higher likelihood of transitioning from low-level to high-level clusters. While the number of rural B&B clusters steadily increased over time, their clustering scale was comparatively lower than that of urban B&Bs, and they lacked high-level clustering. 3) In terms of development direction, urban B&B clusters exhibited a relatively stable pattern and evolved into high-level clustering centers within the main urban areas. Conversely, rural B&Bs exhibited a more pronounced spatial diffusion effect, with clusters showing a trend of multi-center development along the coastline. 4) Transport emerged as a common influencing factor for both urban and rural B&Bs, with road network density having the strongest explanatory power for their spatial distribution. In terms of differences, population agglomeration had a positive impact on the distribution of urban B&Bs and a negative effect on the distribution of rural B&Bs. Rural B&B clustering was more influenced by tourism resources than urban B&B clustering, but increasing the length of tourist stays remains an urgent issue to be addressed. The findings of this study could provide a more precise basis for government planning and management of urban and rural B&B agglomeration areas.
Funding: Supported by the Open Researches Fund Program of LIESMARS (WKL(00)0302).
Abstract: Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and develop two sampling-based DBSCAN (SDBSCAN) algorithms. One algorithm introduces sampling inside DBSCAN, and the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
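The abstract gives no algorithmic detail, so the sketch below shows only one plausible reading of "sampling outside DBSCAN": run a plain DBSCAN on a random sample, then give every point the label of its nearest sampled point if that point lies within eps. The function names, the `fraction` and `seed` parameters, and the nearest-sample labelling rule are all assumptions made for illustration, not the SDBSCAN algorithms themselves.

```python
import math
import random

NOISE = -1


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point (-1 means noise)."""
    n = len(points)
    labels = [None] * n

    def region(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region(j)
            if len(nb) >= min_pts:  # only core points expand the cluster
                queue.extend(nb)
    return labels


def sampled_dbscan(points, eps, min_pts, fraction, seed=0):
    """Cluster a random sample, then label all points from the nearest sample."""
    rng = random.Random(seed)
    m = max(1, int(len(points) * fraction))
    sample = [points[i] for i in rng.sample(range(len(points)), m)]
    sample_labels = dbscan(sample, eps, min_pts)
    labels = []
    for p in points:
        j = min(range(m), key=lambda s: math.dist(p, sample[s]))
        near = math.dist(p, sample[j]) <= eps
        labels.append(sample_labels[j] if near else NOISE)
    return labels


if __name__ == "__main__":
    blob_a = [(x, y) for x in range(3) for y in range(3)]
    blob_b = [(20 + x, 20 + y) for x in range(3) for y in range(3)]
    print(sampled_dbscan(blob_a + blob_b + [(100, 100)], 1.5, 3, fraction=0.8))
```

Sampling thins the data, so in practice `min_pts` would have to be lowered roughly in proportion to `fraction`; the sketch leaves that adjustment to the caller.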
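Abstract: Exploring the spatial distribution pattern of the urban lodging industry is important for optimizing a city's industrial layout and its orderly development. Taking Urumqi as an example and its lodging facilities as the research object, this paper obtained 1948 POI (point of interest) records for the 7 districts and 1 county of Urumqi through the Amap (高德地图) API, and used the DBSCAN (density-based spatial clustering of applications with noise) algorithm to identify the core clusters and reveal the spatial distribution characteristics and influencing factors of the city's lodging industry clusters. The results show that: 1) The lodging industry clusters in Urumqi fall into 5 levels and form a distinct "one primary, one secondary" cluster distribution pattern, which can be summarized as two core cluster development modes: a comprehensive type based on urban infrastructure and a tourism type based on suburban natural landscapes. 2) The lodging industry in Urumqi is distributed along a south-north axis, forming a "one point, one belt, one group, one cluster" distribution mode. 3) The density of shopping malls and supermarkets, road network density, and the density of bus and metro stations are important factors influencing the spatial distribution of lodging industry clusters in Urumqi.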
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41271179).
Abstract: As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by socioeconomic development and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics and influencing factors of physical bookstores. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis and geographical detectors to examine the spatial distribution pattern of, and the factors influencing, physical bookstores in the national central cities/municipality (hereafter, cities) of western China. Based on spatial data, population density, road density and other data, this study constructed a data set of factors influencing physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution of physical bookstores in Xi’an, Chengdu, and Chongqing is unbalanced. Physical bookstores in Xi’an and Chongqing are distributed from southwest to northeast and are relatively clustered, while those in Chengdu are relatively dispersed. Second, the spatial distribution pattern of physical bookstores has formed under the influence of different factors. The intensity and significance of the influencing factors differ among the case cities. In general, however, the social factor, the business factor, the density of research facilities, the tourism factor and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, explanatory power becomes stronger when factors interact. In Xi’an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors, whereas in Chongqing the traffic factors do. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
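Abstract: Density-based spatial clustering of applications with noise (DBSCAN) can discover clusters of different densities and sizes and is robust to noise, so it is widely used in data mining tasks. DBSCAN usually requires tuning the parameters MinPts and Eps to achieve better clustering, but the search for optimal parameters often degrades its performance. This paper optimizes DBSCAN in two respects. First, a parameter-free method is proposed to optimize the selection of DBSCAN's global parameters. The method uses natural nearest neighbors to obtain the natural characteristic value of the data set and uses that value as MinPts. It then computes the natural characteristic set from the natural characteristic value and, using the data distribution within that set, derives Eps in three ways: by taking the minimum, the mean, and the maximum. Second, graphics processing unit (GPU) computing on the Real-time acceleration platform for integrated data science (RAPIDS) is used to accelerate the convergence of the DBSCAN algorithm. Experimental results show that the proposed method optimizes DBSCAN parameter selection while achieving clustering results comparable to those of density peaks clustering (DPC).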
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korea government (MSIT) (No. 2021R1F1A1049387).
Abstract: As the location information of numerous Internet of Things (IoT) devices can be recognized through IoT sensor technology, the need for technology that efficiently analyzes spatial data is increasing. One of the best-known algorithms for grouping dense data into a cluster is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Existing DBSCAN research focuses on efficiently finding clusters in numeric or categorical data. In this paper, we propose the novel problem of discovering a set of adjacent clusters among the cluster results derived for each keyword in the keyword-based DBSCAN algorithm. The existing DBSCAN algorithm has the problem that all possible cases must be enumerated in order to find adjacent clusters among the clusters it produces. To solve this problem, we developed the Genetic algorithm-based Keyword Matching DBSCAN (GKM-DBSCAN) algorithm, which applies a genetic algorithm to discover the set of adjacent clusters among the cluster results derived for each keyword. To improve the performance of GKM-DBSCAN, we improved the general genetic algorithm by performing genetic operations in groups. We conducted extensive experiments on both real and synthetic datasets to compare the effectiveness of GKM-DBSCAN with that of the brute-force method. The experimental results show that GKM-DBSCAN outperforms the brute-force method by up to 21 times. GKM-DBSCAN with index number binarization (INB) is 1.8 times faster than GKM-DBSCAN with cluster number binarization (CNB).
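Abstract: The single-station correction method is computationally complex, unstable, and prone to large errors, and cannot meet the accuracy requirements for downhole geomagnetic azimuth. Building on indirect single-station analysis and Brooks multi-station analysis, a new method is proposed in which the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm identifies and removes noise, an ellipse correction method corrects radial interference, and single-axis multi-station analysis corrects axial interference. Experiments show that the improved multi-station method not only further improves the fit of the ellipse correction but also reduces the influence of noise on the calculated reference-point values; the resulting variance curve converges better and is more stable, and the azimuth error after correction is further reduced.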
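Abstract: To address the urban traffic congestion and even traffic accidents caused by taxis stopping arbitrarily, taxi GPS (Global Positioning System) data from a real area of Chengdu and crawled POI (Point of Interest) data are used. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm clusters pick-up and drop-off points to obtain taxi passenger hotspots; the type of each hotspot area is determined from the POI types; taxi travel demand at different times is analyzed; and fixed taxi stopping areas are then delineated. The results show that the designation of fixed taxi stopping areas is related to travelers' demand: placing fixed stopping areas where travel demand is high can meet travelers' different travel needs. The method of delineating fixed stopping areas by combining taxi passenger hotspots with crawled POI data is highly practical and can offer theoretical and practical value for urban traffic safety.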
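Clustering GPS pick-up and drop-off points requires a distance between latitude/longitude pairs. The abstract does not state which metric was used; the sketch below assumes the haversine great-circle distance and shows the neighbourhood query that a DBSCAN run would repeat for each point. The coordinates and the 0.5 km radius are made-up illustration values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighbours_within(points, i, eps_km):
    """Indices of all points within eps_km of points[i], including i itself."""
    lat, lon = points[i]
    return [j for j, (la, lo) in enumerate(points)
            if haversine_km(lat, lon, la, lo) <= eps_km]


if __name__ == "__main__":
    # Hypothetical (lat, lon) pick-up points; the third is about 11 km north.
    pts = [(30.660, 104.060), (30.661, 104.061), (30.760, 104.060)]
    print(neighbours_within(pts, 0, eps_km=0.5))
```

Expressing the radius in kilometres rather than degrees keeps the neighbourhood size the same in the north-south and east-west directions, which raw degree differences do not.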
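Abstract: To reduce the heavy reliance on manual delineation by archaeologists when identifying settlement groups, Neolithic settlement sites in the Zheng-Luo (Zhengzhou-Luoyang) region are taken as an example, and the density-based DBSCAN (density-based spatial clustering of applications with noise) algorithm is used for spatial clustering of the settlement sites. Analysis of the distribution of settlement sites across four cultural periods in the region shows that the main settlement group shifted from the area south of Mount Song in the east of the study area to the Yiluo River basin in the center of the region, where it settled for a long time and continued to develop and expand. Large settlement sites are mainly located within the main settlement groups, except that some large settlements of the Peiligang culture period are relatively isolated. From the late Yangshao culture to the Longshan culture period, settlement sites show a primary-subordinate ring-shaped distribution pattern. The orientation of most settlement groups is consistent with the distribution of rivers. The study shows that clustering settlement sites with the DBSCAN algorithm is feasible, and the clustering yields the distribution characteristics of settlement sites in the Zheng-Luo region across the four Neolithic cultural periods.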