Funding: This research was funded by the National Natural Science Foundation of China (Grant No. 72001190), by the Humanities and Social Science Project of the Ministry of Education of China (Grant No. 20YJC630173), and by Zhejiang A&F University (Grant No. 2022LFR062).
Abstract: Data stream clustering is integral to contemporary big data applications. However, handling the continuous influx of data efficiently and accurately remains a primary challenge in current research. This paper aims to improve the efficiency and precision of data stream clustering. Taking the TEDA (Typicality and Eccentricity Data Analysis) algorithm as a foundation, we integrate a nearest neighbor search algorithm to enhance both its efficiency and its accuracy. The original TEDA algorithm, grounded in the concept of "Typicality and Eccentricity Data Analytics", is an evolving, recursive method that requires no prior knowledge. Although the algorithm autonomously creates and merges clusters as new data arrives, its efficiency is significantly hindered by the need to traverse all existing clusters whenever a new data point arrives. This work presents the NS-TEDA (Neighbor Search Based Typicality and Eccentricity Data Analysis) algorithm, which incorporates a KD-Tree (K-Dimensional Tree) integrated with the Scapegoat Tree. This ensures that each newly arrived data point interacts only with clusters in very close proximity, which significantly improves efficiency, prevents a single data point from joining too many clusters, and to some extent mitigates the merging of highly overlapping clusters. We apply the NS-TEDA algorithm to several well-known datasets and compare its performance with that of other data stream clustering algorithms and the original TEDA algorithm. The results demonstrate that the proposed algorithm achieves higher accuracy and that its runtime depends almost linearly on the volume of data, making it more suitable for large-scale data stream analysis.
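The neighbor-search idea in the abstract above can be illustrated with a minimal sketch. It assumes that each cluster is represented by its center point and uses a plain, statically built KD-tree; the Scapegoat Tree rebalancing and the TEDA eccentricity update used by NS-TEDA are not reproduced here. All names (`KDNode`, `build_kdtree`, `nearest_center`) are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


class KDNode:
    """One node of a static KD-tree over cluster centers (illustrative only)."""

    def __init__(self, point: Point, axis: int):
        self.point = point
        self.axis = axis
        self.left: Optional["KDNode"] = None
        self.right: Optional["KDNode"] = None


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a KD-tree by splitting on the median along a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    node = KDNode(ordered[mid], axis)
    node.left = build_kdtree(ordered[:mid], depth + 1)
    node.right = build_kdtree(ordered[mid + 1:], depth + 1)
    return node


def nearest_center(node: Optional[KDNode], query: Point,
                   best: Optional[Tuple[float, Point]] = None
                   ) -> Optional[Tuple[float, Point]]:
    """Return (distance, center) of the stored center closest to `query`."""
    if node is None:
        return best
    dist = math.dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest_center(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest_center(far, query, best)
    return best


# Hypothetical cluster centers and one newly arrived data point.
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (2.0, 8.0), (7.0, 3.0)]
tree = build_kdtree(centers)
distance, center = nearest_center(tree, (4.2, 4.0))
print(f"closest cluster center: {center}, distance {distance:.3f}")
```

With such an index, an arriving point only needs to be tested against the few clusters returned by the query rather than against every existing cluster, which is the source of the efficiency gain the abstract describes. A streaming setting would additionally need insertions and rebalancing, which this static sketch omits.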
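Abstract: The clustering quality of the mean shift algorithm depends on a subjectively chosen bandwidth parameter, and its accuracy suffers on datasets with large variations in density. To address these problems, an adaptive mean shift clustering algorithm based on the cover tree, MSCT (MeanShift based on Cover-Tree), is proposed. A cover-tree dataset is constructed and is used during the computation of the shift vector to obtain a new shift vector result, KnnShift, so that the bandwidth parameter is generated adaptively on datasets with different density distributions. The clustering result is obtained once all data points have completed the shift process. Experimental results show that the clustering performance of MSCT is better overall than that of algorithms such as MS and DBSCAN.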
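The abstract does not define KnnShift, so the following is only a generic sketch of an adaptive-bandwidth mean shift, not the MSCT algorithm itself. It assumes that the bandwidth at a point is the distance to its k-th nearest neighbor and that a Gaussian kernel is used; neighbors are found by brute force rather than with a cover tree. The function names are hypothetical.

```python
import math


def knn_bandwidth(x, data, k):
    """Local bandwidth: the (k+1)-th smallest distance from x to the data.

    When x is itself a data point, the smallest distance is its zero
    self-distance, so this is the distance to its k-th neighbor.
    """
    dists = sorted(math.dist(x, p) for p in data)
    return max(dists[min(k, len(dists) - 1)], 1e-12)


def shift_once(x, data, k):
    """One mean shift step with a Gaussian kernel of locally chosen width."""
    h = knn_bandwidth(x, data, k)
    weights = [math.exp(-0.5 * (math.dist(x, p) / h) ** 2) for p in data]
    total = sum(weights)
    return tuple(
        sum(w * p[i] for w, p in zip(weights, data)) / total
        for i in range(len(x))
    )


def mean_shift(x, data, k=3, tol=1e-6, max_iter=200):
    """Shift x until it moves less than tol, or until max_iter steps."""
    for _ in range(max_iter):
        nxt = shift_once(x, data, k)
        if math.dist(x, nxt) < tol:
            return nxt
        x = nxt
    return x


# Hypothetical data: one dense group and one sparse group.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (-0.1, 0.0)]
sparse = [(9.0, 9.0), (11.0, 9.0), (9.0, 11.0), (11.0, 11.0), (10.0, 10.0)]
data = dense + sparse
print("bandwidth in dense region:", knn_bandwidth(dense[1], data, 3))
print("bandwidth in sparse region:", knn_bandwidth(sparse[0], data, 3))
print("mode reached from the sparse group:", mean_shift(sparse[0], data))
```

Because the bandwidth is derived from neighbor distances, it shrinks in dense regions and grows in sparse ones, so no single global bandwidth has to be chosen by hand.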
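Abstract: To quickly and accurately extract individual tree point clouds from forest stand point clouds acquired by a terrestrial 3D laser scanner, a tree segmentation algorithm based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is proposed. First, Gaussian filtering is used to denoise the stand point cloud, and the normalized stand point cloud is divided into vertical segments. DBSCAN is then applied to cluster each vertical segment, the center point of each cluster in each segment is computed, the adjacency between clusters is determined from the distances between cluster centers, and trunk-segment point clouds are matched accordingly. Finally, the RANSAC (Random Sample Consensus) algorithm fits a straight line to each trunk-segment point cloud, and each point is assigned to a tree according to its distance from the fitted line, which completes the segmentation. In stands with medium and high canopy closure, the harmonic value F of the proposed algorithm ranges from 0.88 to 0.99 and from 0.72 to 0.74, respectively, whereas that of the distance-discrimination-based tree segmentation algorithm ranges from 0.84 to 0.90 and from 0.73 to 0.79. The proposed algorithm effectively segments individual tree point clouds in stands with different canopy closure, performs particularly well in stands with medium canopy closure, and can achieve accurate tree segmentation of stand point clouds.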
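The per-segment clustering step described above can be sketched as follows. This is a minimal textbook DBSCAN with a brute-force neighborhood query, applied to the 2D coordinates of one hypothetical vertical segment, followed by the cluster-center computation. The Gaussian denoising, the matching of clusters across segments, and the RANSAC line fit are not shown, and the parameter values and sample points are assumptions made for illustration.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


def cluster_centers(points, labels):
    """Mean position of each cluster, ignoring noise points."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != NOISE:
            groups.setdefault(lab, []).append(p)
    return {
        lab: tuple(sum(c) / len(members) for c in zip(*members))
        for lab, members in groups.items()
    }


# Hypothetical horizontal slice through two stems plus one stray return.
segment = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
           (2.5, 9.0)]
labels = dbscan(segment, eps=0.3, min_pts=3)
print(labels)
print(cluster_centers(segment, labels))
```

Comparing the centers returned for consecutive vertical segments is then one way to decide which clusters belong to the same trunk, in the spirit of the adjacency test the abstract mentions.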
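Abstract: In outdoor environments that span long distances, multi-sensor simultaneous localization and mapping (SLAM) systems suffer from insufficient mapping accuracy, trajectory drift, and even mapping failure because of incorrect fusion of sensor information, feature mismatches, or unreliable sensor state information. To address this, a tightly coupled multi-sensor algorithm based on factor graph optimization is proposed (tightly-coupled lidar-visual-inertial odometry via smoothing, mapping and DBSCAN, LVI-SMAD). The clustering results obtained jointly from the front-end point cloud and visual information serve as factor graph optimization constraints and are added, as a lower-frame-rate constraint, to the higher-frame-rate point cloud map output. This strengthens the tight coupling between point cloud and visual information and resolves information mismatches between the lidar and the camera. The same constraint also acts as a supplementary constraint when the information from one sensor is unreliable, which reduces localization drift under unstable sensor information and improves the consistency of the algorithm. Experiments show that in low-slope, long-span working environments LVI-SMAD reduces the absolute trajectory error by 39.90% compared with LVI-SAM and by 63.09% compared with LIO-SAM; in high-slope working environments the absolute trajectory error is reduced by 41.08% compared with LVI-SAM and by 64.87% compared with LIO-SAM, demonstrating the effectiveness and feasibility of the algorithm.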
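The percentages quoted above are relative reductions in absolute trajectory error. As a minimal sketch, assuming the error is the root-mean-square position difference between time-synchronized estimated and ground-truth positions already expressed in the same frame (the abstract does not state which variant or alignment step was used), the computation could look like this. The function names and trajectories are hypothetical and are not data from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position differences between two equally long trajectories."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical trajectories, for illustration only.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
baseline = [(x, y + 2.0, z) for x, y, z in truth]   # constant 2 m offset
improved = [(x, y + 1.0, z) for x, y, z in truth]   # constant 1 m offset
reduction = relative_reduction(ate_rmse(baseline, truth),
                               ate_rmse(improved, truth))
print(f"ATE reduced by {reduction:.2f}%")
```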
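Abstract: Based on the Petersen graph, the Binary Tree Petersen network structure is proposed and its properties are studied. The Binary Tree Petersen network is proved to be regular and to scale well, and it has a shorter diameter than RP(k) and the 2-D Torus as well as good parallel capability. In addition, unicast and broadcast routing algorithms on the Binary Tree Petersen network are given, and the communication efficiency of both is proved to be 2j+4.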
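The abstract does not spell out the Binary Tree Petersen construction, so the sketch below covers only the underlying Petersen graph. It builds the graph with the standard labeling (outer 5-cycle, spokes, inner pentagram) and checks by breadth-first search the two properties that make it attractive as a building block: every node has degree 3 and the diameter is 2. The function names are hypothetical.

```python
from collections import deque


def petersen_graph():
    """Adjacency sets of the Petersen graph on nodes 0..9."""
    adj = {v: set() for v in range(10)}

    def connect(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(5):
        connect(i, (i + 1) % 5)            # outer 5-cycle
        connect(i, i + 5)                  # spoke to the inner node
        connect(5 + i, 5 + (i + 2) % 5)    # inner pentagram
    return adj


def eccentricity(adj, source):
    """Largest hop count from source to any node, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all nodes of a connected graph."""
    return max(eccentricity(adj, v) for v in adj)


graph = petersen_graph()
print("degrees:", sorted(len(neighbors) for neighbors in graph.values()))
print("diameter:", diameter(graph))
```

The same two helper functions can be reused to measure the degree and diameter of any larger network assembled from Petersen units, once its adjacency structure is known.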