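Abstract: Waterway traffic accidents occur from time to time on inland rivers and threaten the safe and efficient development of waterborne transport. This study proposes a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method with adaptive parameters for identifying accident black-spot waters on inland rivers. The method automatically selects the neighborhood radius ε and the threshold P_min on the number of data objects in a neighborhood, which can improve the accuracy and efficiency of the cluster analysis. A case study based on bulk-carrier accident data from the lower reaches of the Yangtze River trunk line from 2010 to 2019 identified eight accident black spots, and the accident characteristics and causes of each typical black-spot section were analyzed. In addition, Getis-Ord General G clustering was used to identify high-severity accident areas within the black spots; the black spots and high-severity accidents were found to be distributed mainly around mid-channel bars, bridge areas, and port and terminal areas. The results are broadly consistent with actual conditions, which to some extent indicates that the method is scientifically sound and practical for analyzing the distribution of inland waterway traffic accidents.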
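The abstract above does not say how ε and P_min are selected. As an illustration only, the sketch below shows one widely used heuristic for choosing ε: sort every point's distance to its k-th nearest neighbour and take the value at the "knee" of that curve. This is not the authors' procedure; the function names `k_distances` and `knee_eps` and the parameter `k` are hypothetical, and the knee is located on the raw, unnormalised curve, which is a simplification.

```python
import math

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbour."""
    out = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)

def knee_eps(sorted_kdist):
    """Return the k-distance value farthest from the chord joining the curve's ends.

    Assumes at least two values. This is a simple knee heuristic,
    not the selection rule used in the paper.
    """
    n = len(sorted_kdist)
    x0, y0, x1, y1 = 0, sorted_kdist[0], n - 1, sorted_kdist[-1]
    norm = math.hypot(x1 - x0, y1 - y0)
    best_i, best_gap = 0, -1.0
    for i, y in enumerate(sorted_kdist):
        # Perpendicular distance from (i, y) to the chord.
        gap = abs((y1 - y0) * (i - x0) - (x1 - x0) * (y - y0)) / norm
        if gap > best_gap:
            best_i, best_gap = i, gap
    return sorted_kdist[best_i]
```

In practice the curve is usually inspected by eye as well, since an automatic knee can be misled by a handful of extreme outliers.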
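Abstract: To address the low efficiency and misjudgments of ship-encounter identification algorithms on big data, a density-based spatial clustering of applications with noise (DBSCAN) algorithm incorporating the International Regulations for Preventing Collisions at Sea (COLREGs) is proposed, and a ship-encounter identification model is built. When DBSCAN counts the ships within a neighborhood, the distance to closest point of approach (DCPA) and the time to closest point of approach (TCPA) between ships are computed to pre-screen noise points in the neighborhood. Ship-encounter risk is then computed with a fuzzy comprehensive evaluation model to screen the ships in the neighborhood a second time, thereby extracting the encounter situations. The results show that the improved DBSCAN algorithm filters out the non-encounter situations identified by conventional DBSCAN and that the number of ships in any single encounter situation stays within four. The output risk-evolution trends of encountering ships are well suited to monitoring high-risk ships in real waters and can effectively support collision avoidance. The proposed identification model is of significance for ensuring navigational safety and improving the efficiency of maritime supervision.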
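The DCPA/TCPA pre-screening step can be illustrated with the standard constant-velocity formulas. The sketch below is a minimal example under stated assumptions: positions and velocities are given in a flat local x/y frame (metres and metres per second), the function names `cpa` and `is_candidate` are hypothetical, and the thresholds `dcpa_max` and `tcpa_max` are placeholder values rather than the ones used in the paper.

```python
import math

def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (dcpa, tcpa) for two ships moving at constant velocity.

    Positions are (x, y) in metres, velocities are (vx, vy) in m/s.
    A negative tcpa means the closest approach is already in the past.
    """
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]  # relative position
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]  # relative velocity
    v2 = vx * vx + vy * vy
    if v2 == 0.0:
        # Identical velocities: the range between the ships never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / v2
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa

def is_candidate(dcpa, tcpa, dcpa_max=1852.0, tcpa_max=1200.0):
    """Keep a neighbour only if it will pass close by in the near future.

    The default thresholds are illustrative assumptions.
    """
    return 0.0 <= tcpa <= tcpa_max and dcpa <= dcpa_max

# Two ships on reciprocal courses, offset 500 m sideways.
dcpa, tcpa = cpa((0, 0), (0, 5), (500, 2000), (0, -5))
print(dcpa, tcpa, is_candidate(dcpa, tcpa))  # 500.0 200.0 True
```

A filter of this kind would run inside the neighborhood count, so that ships which are near but diverging do not contribute to density.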
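Abstract: To remove noise points from point-cloud data, a multi-scale point-cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. DBSCAN is then optimized to reduce its time complexity and to adjust its parameters adaptively; the point cloud is thereby divided into normal, suspect, and abnormal clusters, and the abnormal clusters are removed immediately. A distance-consensus evaluation method then examines the suspect clusters in detail: the distance between each suspect point and the surface fitted to its nearest normal points is computed to decide whether the point is an outlier, which effectively preserves the key features of the data and the sensitivity of the model. The method was used to denoise the point clouds of two hull blocks and was compared with other denoising algorithms. The results show that it has advantages in denoising efficiency and feature preservation and accurately retains the geometric properties of the point-cloud data.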
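Abstract: To extract individual-tree point clouds quickly and accurately from forest-stand point clouds acquired by a terrestrial 3D laser scanner, a tree segmentation algorithm based on DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is proposed. The stand point cloud is first denoised with a Gaussian filter and, after normalization, divided into vertical segments. DBSCAN is then applied to each vertical segment, the center point of each cluster in each segment is computed, adjacency between clusters is determined from the distances between cluster centers, and the trunk-segment point clouds are matched accordingly. Finally, the RANSAC (Random Sample Consensus) algorithm fits a straight line to each trunk-segment point cloud, and each point is assigned according to its distance from the fitted line, which completes the tree segmentation. In stands with medium and high canopy closure, the F value (harmonic mean) of the proposed algorithm ranged over 0.88~0.99 and 0.72~0.74, respectively, while that of a distance-discrimination-based tree segmentation algorithm ranged over 0.84~0.90 and 0.73~0.79. The proposed algorithm segments individual-tree point clouds effectively in stands of different canopy closure and performs particularly well at medium canopy closure, enabling accurate tree segmentation of stand point clouds.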
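The final assignment step, deciding which tree a point belongs to from its distance to the fitted trunk lines, can be sketched as follows. This is a minimal illustration that assumes the trunk lines have already been fitted (for example by RANSAC) and are given as a point `a` on the line plus a direction `d`; the function names and the cut-off `max_dist` are hypothetical and not taken from the paper.

```python
import math

def point_line_distance(p, a, d):
    """Distance from 3D point p to the line through a with direction d."""
    w = (p[0] - a[0], p[1] - a[1], p[2] - a[2])
    # |w x d| / |d| is the perpendicular distance to the line.
    cx = w[1] * d[2] - w[2] * d[1]
    cy = w[2] * d[0] - w[0] * d[2]
    cz = w[0] * d[1] - w[1] * d[0]
    return math.sqrt(cx * cx + cy * cy + cz * cz) / math.sqrt(
        d[0] ** 2 + d[1] ** 2 + d[2] ** 2
    )

def assign_to_trunks(points, trunks, max_dist):
    """Label each point with the index of its nearest trunk line, or -1.

    trunks is a list of (a, d) pairs; max_dist is an assumed cut-off beyond
    which a point is left unassigned.
    """
    labels = []
    for p in points:
        dists = [point_line_distance(p, a, d) for a, d in trunks]
        best = min(range(len(trunks)), key=dists.__getitem__)
        labels.append(best if dists[best] <= max_dist else -1)
    return labels
```

For two vertical trunks at x = 0 and x = 5, a point at (0.5, 0, 3) is assigned to the first, and a point 20 m away from both stays unassigned when `max_dist` is 2.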
Funding: Under the auspices of the National Social Science Foundation of China (No. 21BJY202).
Abstract: There are significant differences between urban and rural bed-and-breakfasts (B&Bs) in terms of customer positioning, economic strength and spatial carrier. Accurately identifying the differences in the spatial characteristics and influencing factors of each type is essential for creating urban and rural B&B agglomeration areas. This study used density-based spatial clustering of applications with noise (DBSCAN) and the multi-scale geographically weighted regression (MGWR) model to explore similarities and differences in the spatial distribution patterns and influencing factors of urban and rural B&Bs on the Jiaodong Peninsula of China from 2010 to 2022. The results showed that: 1) Both urban and rural B&Bs on the Jiaodong Peninsula went through three stages: a slow start from 2010 to 2015, rapid development from 2015 to 2019, and hindered development from 2019 to 2022. However, urban B&Bs demonstrated a higher development speed and agglomeration intensity, leading to an increasingly evident trend of uneven development between the two sectors. 2) The clustering scale of both urban and rural B&Bs continued to expand in terms of quantity and volume. Urban B&B clusters were characterized by a limited number but a higher likelihood of transitioning from low-level to high-level clusters. While the number of rural B&B clusters steadily increased over time, their clustering scale was lower than that of urban B&Bs, and they lacked high-level clustering. 3) In terms of development direction, urban B&B clusters exhibited a relatively stable pattern and evolved into high-level clustering centers within the main urban areas. Conversely, rural B&Bs exhibited a more pronounced spatial diffusion effect, with clusters showing a trend of multi-center development along the coastline. 4) Transport emerged as a common influencing factor for both urban and rural B&Bs, with road network density having the strongest explanatory power for their spatial distribution. In terms of differences, population agglomeration had a positive impact on the distribution of urban B&Bs and a negative effect on the distribution of rural B&Bs. Rural B&B clustering was more strongly influenced by tourism resources than urban B&B clustering, but increasing the duration of tourist stays remains an urgent issue. The findings of this study could provide a more precise basis for government planning and management of urban and rural B&B agglomeration areas.
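Abstract: Terminal-area airspace is complex and densely trafficked, and accurate trajectory prediction can greatly improve the level of air traffic services and safeguard flight safety. For high-accuracy 4D trajectory prediction of multiple flights in the terminal area, this paper proposes a trajectory prediction method that combines Density-Based Spatial Clustering of Applications with Noise (DBSCAN) with the Gated Recurrent Unit (GRU). DBSCAN groups terminal-area flights with similar trajectories into the same cluster, and a GRU-based trajectory prediction model is built for each cluster. To predict a terminal-area flight, the method first determines which cluster the flight belongs to and then applies the prediction model for that cluster to perform 4D trajectory prediction. Compared with conventional prediction methods that study only a single flight, the proposed algorithm makes effective use of terminal-area trajectory data, and the resulting models can predict trajectories for multiple flights, which widens the applicability of the model and improves prediction accuracy.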
Funding: Professor Hong Yu at the Intelligent Fishery Innovative Team (No. C202109), School of Information Engineering, Dalian Ocean University, for her support of this work; funded by the National Natural Science Foundation of China (No. 31800615 and No. 21933010).
Abstract: Cluster analysis of molecular conformations is an important way to find representative conformations in molecular dynamics trajectories, and it is usually a critical step in interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computational efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP, named K-means find density peaks (KFDP), to address its heavy resource consumption. In KFDP, the points are first clustered by a highly efficient clustering algorithm such as K-means. The cluster centers are defined as typical points, each with a weight that represents the cluster size. The weighted typical points are then clustered again by FDP and refined into core, boundary, and redefined halo points. In this way, KFDP achieves accuracy comparable to that of FDP while its computational complexity is reduced from O(n^2) to O(n). We apply and test the KFDP method on trajectory data of multiple small proteins in terms of torsion angle, secondary structure, or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise validate the proposed KFDP.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41271179).
Abstract: As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by socio-economic development and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics of physical bookstores and the factors that influence them. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis and geographical detectors to examine the spatial distribution pattern of physical bookstores, and the factors influencing it, in the national central cities/municipality (hereafter, cities) of western China. Based on spatial data, population density, road density and other data, this study constructed a data set of factors influencing physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution of physical bookstores in Xi’an, Chengdu, and Chongqing is unbalanced. In Xi’an and Chongqing the distribution runs from southwest to northeast and is relatively clustered, while in Chengdu it is relatively dispersed. Second, the spatial distribution pattern of physical bookstores has formed under the influence of different factors, and the intensity and significance of the influencing factors differ among the case cities. In general, however, the social factor, business factor, density of research facilities, tourism factor and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, explanatory power becomes stronger when factors interact. In Xi’an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors, whereas in Chongqing the traffic factors do. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB0201305), the National Science and Technology Major Project (No. 2013ZX0102-8001-001-001), and the National Natural Science Foundation of China (No. 91430218, 31327901, 61472395, 61272134, 61432018).
Abstract: Clustering data with varying densities and complicated structures is important, yet many existing clustering algorithms struggle with this problem, because varying densities and complicated structures cause a single algorithm to perform badly on different parts of the data. On the assumption that denser parts probably carry more information, an algorithm that clusters starting from the high-density parts is proposed. It begins from a tiny distance to find the highest density-connected partition and form the corresponding super cores; the distance is then increased iteratively by a global heuristic method to cluster parts with different densities. The mean silhouette coefficient indicates the clustering performance. A denoising function is implemented to eliminate the influence of noise and outliers. Many challenging experiments indicate that the algorithm performs well on data with widely varying densities and extremely complex structures. It determines the optimal number of clusters automatically, needs no background knowledge, is easy to tune, and is robust against noise and outliers.
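Abstract: Density-based spatial clustering of applications with noise (DBSCAN) can discover clusters of different densities and sizes and is robust to noise, so it is widely used in data mining tasks. DBSCAN usually requires its parameters MinPts and Eps to be tuned to obtain better clustering results, but the search for optimal parameters often degrades its performance. This paper optimizes DBSCAN in two respects. First, a parameter-free method is proposed to optimize the selection of DBSCAN's global parameters. The method uses natural nearest neighbors to obtain the natural eigenvalue of the data set and takes it as the value of MinPts. It then computes the natural feature set from the natural eigenvalue and, using the data distribution within that set, derives the value of Eps in three ways: as the minimum, the mean, and the maximum. Second, graphics processing unit (GPU) computation on the Real-time acceleration platform for integrated data science (RAPIDS) is used to speed up the convergence of the DBSCAN algorithm. Experimental results show that the proposed method optimizes DBSCAN parameter selection while achieving clustering results comparable to those of density peaks clustering (DPC).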
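Abstract: To address the poor accuracy of mixing-matrix estimation in underdetermined blind source separation, this paper proposes a mixing-matrix estimation algorithm that combines density-based spatial clustering of applications with noise (DBSCAN) with probability density estimation. First, a detection criterion for single-source time-frequency points is obtained by vector transformation, and single-source points are detected from the mixed signals according to this criterion. Second, the density-based spatial clustering algorithm clusters the single-source points, which yields an estimate of the number of sources and the single-source points belonging to each class. Third, probability density estimation is used to obtain the cluster center of each class, and these centers form the mixing matrix. The proposed method does not require the number of sources to be set in advance and avoids the poor clustering caused by unevenly distributed data. Finally, compressed sensing is used to recover the source signals, thereby separating the individual source signals from the mixed signals. Experimental results show that the proposed method can estimate the mixing matrix accurately when the number of sources is unknown and that the separated signals are of high quality.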
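Abstract: The single-station correction method is computationally complex, unstable, and prone to large errors, and it cannot meet the accuracy requirements for downhole geomagnetic azimuth. Building on indirect single-station analysis and the Brooks multi-station analysis, a new method is proposed: the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm identifies and removes noise, ellipse correction corrects the radial interference, and single-axis multi-station analysis corrects the axial interference. Experiments show that the improved multi-station method not only further improves the fit of the ellipse correction but also reduces the influence of noise on the computed values of the reference points; the resulting variance curve converges better and is more stable, and the azimuth error after correction is further reduced.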
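Abstract: Situation assessment in a naval battlefield environment involves a large number of targets of complex and diverse types. To address this, data clustering is first introduced into the target-grouping stage of situation assessment, and density clustering based on the DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. The algorithm can cluster data of arbitrary shape, traverses the data thoroughly, and can group battlefield targets comprehensively and reasonably. The basic computational steps of the algorithm are then given, and its correctness is verified on a worked example with target groups from a known battlefield situation. Finally, the algorithm is compared with the partition-based K-means algorithm and the hierarchy-based AGNES (AGglomerative NESting) algorithm, which demonstrates its effectiveness and rationality.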
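The abstract does not list its computational steps, so the sketch below shows the standard textbook DBSCAN procedure for reference: find each point's ε-neighborhood, start a cluster at any unvisited core point, expand it through density-reachable points, and mark everything else as noise. It is a plain-Python illustration with a brute-force neighbor search, not the authors' implementation; the names `region_query` and `dbscan` are chosen here for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbours)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                seeds.extend(j_neighbours)  # j is a core point: keep expanding
    return labels

targets = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
print(dbscan(targets, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike K-means, the number of groups is not supplied in advance, and the isolated target at (50, 50) is reported as noise rather than forced into a group.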
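Abstract: Taxis that stop wherever they please cause urban congestion and even traffic accidents. To address this, taxi GPS (Global Positioning System) data from a real area of Chengdu and crawled POI (Point of Interest) data are used. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm clusters the pick-up and drop-off points to obtain taxi passenger hotspots, the type of each hotspot area is determined from the POI types, and taxi travel demand at different times is analyzed, from which fixed taxi stopping areas are delineated. The results show that the placement of fixed taxi stopping areas is related to travelers' demand: placing them where demand is high can meet travelers' different travel needs. The method of delineating fixed stopping areas by combining taxi passenger hotspots with crawled POI data is highly practical and offers theoretical and practical value for urban traffic safety.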