Research on Parallel K-Medoids algorithm based on MapReduce
Author: Xianli QIN. 《International Journal of Technology Management》, 2015, Issue 1, pp. 26-28 (3 pages)
To overcome the memory-capacity and CPU-speed bottlenecks that the traditional K-Medoids clustering algorithm faces when handling massive data, the paper proposes a parallel K-Medoids algorithm built on the MapReduce programming model. Under the MapReduce model, the algorithm increases the computation granularity and reduces the communication cost ratio. Experimental results show that, compared with other algorithms, the improved parallel algorithm achieves substantially higher speedup and operating efficiency.
Keywords: k-medoids; MapReduce; parallel computing; Hadoop
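The abstract above does not give the paper's actual map and reduce functions, so the following is only a minimal single-process sketch of how one K-medoids iteration is commonly split into a map step (assign each point to its nearest medoid) and a reduce step (pick a new medoid per cluster). The function names `map_phase`, `reduce_phase`, and `kmedoids_mapreduce`, the Euclidean distance, and the toy data are all assumptions made for illustration; a real Hadoop job would distribute the map and reduce calls across nodes.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, medoids):
    """Map: emit (index of nearest medoid, point) for every input point."""
    for p in points:
        nearest = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        yield nearest, p


def reduce_phase(cluster_points):
    """Reduce: choose the member with the smallest total distance to the rest."""
    return min(cluster_points,
               key=lambda c: sum(dist(c, q) for q in cluster_points))


def kmedoids_mapreduce(points, medoids, max_iter=20):
    """Repeat map, shuffle, and reduce until the medoids stop changing."""
    for _ in range(max_iter):
        groups = defaultdict(list)  # shuffle step: group values by key
        for key, p in map_phase(points, medoids):
            groups[key].append(p)
        # An empty cluster keeps its previous medoid.
        new_medoids = [reduce_phase(groups[i]) if groups[i] else medoids[i]
                       for i in range(len(medoids))]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
print(kmedoids_mapreduce(points, [(0.0, 1.0), (10.0, 11.0)]))
```

Because each map call depends only on one point and the current medoids, the points can be partitioned freely across workers; only the medoid list has to be broadcast between iterations, which is one plausible reading of the reduced communication cost the abstract mentions.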
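Author: 亓振涛. 《电子设计工程》, 2025, Issue 3, pp. 13-17, 23 (6 pages)
To improve the utilization of cascade water conservancy hub information in practical work, an intelligent information integration method for cascade water conservancy hubs based on K-medoids clustering is proposed. Information is collected on the project, hydrology, hub equipment, and related aspects, and the initial information is preprocessed through cleaning, normalization, and other steps chosen for each information type. Taking the extracted information features as the processing object, K-medoids is used to cluster the information, and redundancy filtering of the integrated information yields the intelligent integration result. Performance tests conclude that, compared with the traditional integration method, the optimized method improves completeness by 6.06% and reduces redundancy by 1.79%, and the integrated information has a higher utilization rate.
Keywords: K-medoids clustering; cascade water conservancy hub; water conservancy information; information integration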
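Authors: 戴朝辉, 陈昊, 刘莘轶, 夏长青, 郭嘉毅, 于立军. 《太阳能学报》 (PKU Core), 2025, Issue 1, pp. 654-661 (8 pages)
To safeguard the supply-demand balance and the safe, stable operation of the power grid, and to improve the accuracy of power forecasting for large photovoltaic plants, a short-term photovoltaic power forecasting model is proposed. It uses a long short-term memory (LSTM) neural network optimized by a combination of the K-medoids clustering algorithm, gradient boosting decision trees (GBDT), and particle swarm optimization (PSO). First, K-medoids clustering groups the weather data in a large sample of photovoltaic generation data into three weather types: sunny, overcast, and rain/snow. Next, features are engineered from the existing data, and GBDT is used to analyze feature importance for each type, selecting the features that significantly affect photovoltaic power forecasting and building an optimized dataset of suitable size and structure. Finally, the reconstructed dataset is fed to the PSO-optimized LSTM model for training to build the short-term forecasting model. Experimental results show that the model achieves higher forecasting accuracy: compared with a single LSTM model, its RMSE on rain/snow days is 12.19% lower.
Keywords: photovoltaic power generation; power forecasting; machine learning; long short-term memory network; optimization algorithm; particle swarm optimization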
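Author: 余洋. 《信息与电脑》, 2024, Issue 11, pp. 23-25 (3 pages)
Current computer information processing techniques suffer from low computational efficiency on large-scale datasets and from sensitivity to noise and outliers. To address these problems, this paper proposes an improved K-medoids clustering algorithm. The method optimizes the selection and update strategy of the initial medoids, which improves the convergence speed and stability of the algorithm, and introduces a density-based clustering evaluation index, which improves robustness to noisy data. Experiments on real and synthetic datasets demonstrate the method's effectiveness in improving clustering quality and in handling large-scale data.
Keywords: k-medoids; information processing; cluster analysis; technical optimization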
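Multi-user short-term load forecasting based on DTW K-medoids and a VMD multi-branch neural network. Cited by: 2
Authors: 王宇飞, 杜桐, 边伟国, 张钊, 刘慧婷, 杨丽君. 《中国电力》 (CSCD, PKU Core), 2024, Issue 6, pp. 121-130 (10 pages)
Multi-user power load forecasting predicts the load of multiple users or regions from historical load data, letting grid companies understand the demand of different users or regions so that they can better plan and optimize dispatch. However, because users exhibit complex and diverse consumption behavior, traditional methods struggle to model them uniformly and to forecast quickly and accurately. A multi-user short-term load forecasting model based on DTW K-medoids and a VMD multi-branch neural network is therefore constructed. First, user load data are clustered with DTW K-medoids, which computes distances between series with dynamic time warping (DTW) in place of the Euclidean distance conventionally used in K-medoids, improving multi-user load clustering. On this basis, to fully capture the long- and short-term temporal dependencies in historical load data, a parallel forecasting method based on variational mode decomposition (VMD) and a multi-branch neural network is built for multi-user short-term load forecasting. Finally, clustering, training, and testing experiments use 365 days of load data from 20 users in one region. The model's mean absolute error, root mean square error, and other metrics are substantially lower than those of the comparison models, indicating that the method effectively characterizes the consumption behavior of multiple user types and improves the efficiency and accuracy of multi-user load forecasting.
Keywords: multi-user; load forecasting; DTW K-medoids clustering; variational mode decomposition (VMD); multi-branch neural network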
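The abstract above replaces the Euclidean distance in K-medoids with dynamic time warping. As a minimal sketch of that idea, assuming one-dimensional load curves, an absolute-difference local cost, and no warping window (none of which the abstract specifies), the textbook DTW recurrence and a medoid selection under it might look as follows. The names `dtw_distance` and `medoid` and the sample curves are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a[i-1] repeats a match
                                    cost[i][j - 1],      # b[j-1] repeats a match
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def medoid(series):
    """Return the series with the smallest total DTW distance to the others."""
    return min(series, key=lambda s: sum(dtw_distance(s, t) for t in series))


# Hypothetical load curves: the first two differ only by a stretched plateau.
curves = [[1, 2, 3], [1, 2, 2, 3], [5, 5, 5]]
print(dtw_distance(curves[0], curves[1]))
print(medoid(curves))
```

DTW lets curves of different lengths, or with shifted peaks, count as similar, which Euclidean distance cannot do. Since a medoid is always an actual member of the dataset, K-medoids needs only pairwise distances, so swapping in DTW requires no notion of an "average" curve.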
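Authors: 郭光根, 何蕊, 张玉军. 《科技创新与应用》, 2024, Issue 35, pp. 39-43 (5 pages)
Data in the tobacco logistics industry come from extremely broad and diverse sources: they differ in format, are structurally complex, and are often scattered across separate information systems, so data throughput is low when logistics data are integrated. To address this, a multi-source tobacco logistics data integration method for heterogeneous environments based on K-medoids clustering is proposed. Undersampling balances the class distribution, and correlation-based and threshold-based cleaning removes redundant information, improving data quality. A K-medoids-based integration framework is designed, transfer learning dynamically adjusts source-domain weights to optimize clustering performance in the target domain, and new data points with similarity constraints are introduced as initial cluster centers, achieving effective integration of multi-source tobacco logistics data in heterogeneous environments. Experimental results show that the designed method uses clustering to group and consolidate data from different sources effectively, reducing the complexity of data processing and raising integration throughput.
Keywords: K-medoids clustering; heterogeneous environment; multi-source data; tobacco logistics data; data integration method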
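Authors: 高丽娟, 王志伟, 李明江, 曲晓慧. 《计算机仿真》, 2024, Issue 11, pp. 127-131 (5 pages)
Information data in oil and gas engineering are voluminous and come from many channels; they are widely distributed and uneven in quality, so integrating all data points directly tends to degrade the quality of the information matrix and fails to meet practical needs. A multi-source information data integration algorithm for oil and gas engineering based on K-medoids clustering is proposed. First, a multi-source dataset is constructed and representative points are selected using a decision graph. Then, with a hybrid representative strategy based on the nearest-neighbor approximation principle, a sparse affinity submatrix is built and sparsified, and base clusterings of the data are obtained with a fast nearest-representative approximation method. Finally, a Lagrangian function is used to weight the base clustering results and compute the clustering cost, completing the integration. Experiments show that the proposed method needs a low average number of iterations on the datasets, keeps CA above 96% and NMI above 0.94 throughout, and produces smooth curves with small fluctuations, indicating that the clustering ensemble is accurate and performs well.
Keywords: K-medoids clustering; multi-source information data; decision graph; sparse affinity submatrix; base clustering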
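Authors: 刘浩杰, 冯庆, 梁建波, 何成威, 吴鼎. 《水利技术监督》, 2024, Issue 7, pp. 16-19 (4 pages)
When integrating information resources of cascade water conservancy hubs, traditional algorithms can cluster only single-source information, so integration efficiency is low. To address this, the article proposes an information resource integration method for cascade water conservancy hubs based on the K-medoids clustering algorithm. A complete integration mechanism is established, and an information resource integration model is designed that integrates the various information resources comprehensively and effectively. A utilization coefficient for the information resources is determined; evaluating and adjusting this coefficient optimizes how the resources are allocated and used. Experiments show that the method improves integration efficiency, performs well in application, and has practical value.
Keywords: K-medoids clustering algorithm; water conservancy hub information; resource integration; utilization coefficient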
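Authors: 杨玺, 陈爽, 彭子睿, 高镇, 王安龙. 《微型电脑应用》, 2024, Issue 1, pp. 80-83 (4 pages)
To obtain higher forecasting accuracy, a distributed short-term load forecasting method based on k-Medoids clustering and deep learning is proposed. Based on the energy consumption distribution of distribution transformers, k-Medoids clustering groups the data in the power load dataset, and short-term load forecasting models based on a deep neural network (DNN) and a long short-term memory network (LSTM) are built. The method is validated on the Wuhan distribution network system, which has a data subset of 1000 substations. The results show that the proposed k-Medoids clustering fits the average parameters of the individual transformer forecasting models while reducing training time by 44%, and that the DNN and LSTM forecasting models track the actual load with mean absolute percentage errors (MAPE) of 7.32% and 11.15%, respectively.
Keywords: short-term load forecasting; k-medoids clustering; deep learning; deep neural network; long short-term memory network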
A State of Art Analysis of Telecommunication Data by k-Means and k-Medoids Clustering Algorithms
Author: T. Velmurugan. 《Journal of Computer and Communications》, 2018, Issue 1, pp. 190-202 (13 pages)
Cluster analysis is one of the major data analysis methods widely used for many practical applications in emerging areas of data mining. A good clustering method will produce high-quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends of available data and its uses for the real world. This research work is carried out to find the performance of two of the most delegated partition-based clustering algorithms, namely k-Means and k-Medoids. A state-of-the-art analysis of these two algorithms is implemented, and performance is analyzed based on the quality of their clustering results by means of execution time and other components. Telecommunication data is the source data for this analysis. Connection-oriented broadband data is given as input to find the clustering quality of the algorithms. The distance between the server locations and their connections is considered for clustering. Execution time for each algorithm is analyzed and the results are compared with one another. Results found in the comparison study are satisfactory for the chosen application.
Research on Euclidean Algorithm and Reflection on Its Teaching
Author: ZHANG Shaohua. 《应用数学》 (PKU Core), 2025, Issue 1, pp. 308-310 (3 pages)
In this paper, we prove that Euclid's algorithm, Bezout's equation and the Division algorithm are equivalent to each other. Our result shows that Euclid had preliminarily established the theory of divisibility and the greatest common divisor. We further provide several suggestions for teaching.
Keywords: Euclid's algorithm; Division algorithm; Bezout's equation
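The three notions named in the abstract above meet in the standard extended Euclidean algorithm, sketched below as an illustration rather than as the paper's proof. Each loop iteration applies the division algorithm (quotient and remainder), the loop as a whole is Euclid's algorithm, and the coefficients it tracks solve Bezout's equation a*x + b*y = gcd(a, b). The function name `extended_gcd` is hypothetical, and non-negative integer inputs are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b   # remainders produced by Euclid's algorithm
    old_x, x = 1, 0   # invariant: old_r == a*old_x + b*old_y
    old_y, y = 0, 1   # invariant: r     == a*x     + b*y
    while r != 0:
        q = old_r // r                   # division algorithm: old_r = q*r + rest
        old_r, r = r, old_r - q * r      # Euclid's step on the remainders
        old_x, x = x, old_x - q * x      # same step applied to the coefficients
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(240, 46)
print(g, 240 * x + 46 * y)  # both values equal gcd(240, 46)
```

The two invariants in the comments hold before and after every iteration, which is why the final coefficients satisfy Bezout's equation as soon as the remainder reaches zero.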
An Algorithm for Cloud-based Web Service Combination Optimization Through Plant Growth Simulation
Authors: Li Qiang, Qin Huawei, Qiao Bingqin, Wu Ruifang. 《系统仿真学报》 (PKU Core), 2025, Issue 2, pp. 462-473 (12 pages)
To improve the efficiency of cloud-based web services, a scheduling model based on an improved plant growth simulation algorithm is proposed. The model first uses mathematical methods to describe the relationships between cloud-based web services and the constraints of system resources. Then, a light-induced plant growth simulation algorithm is established. The performance of the algorithm is compared across several plant types, and the best plant model is selected as the setting for the system. Experimental results show that when the number of test cloud-based web services reaches 2048, the model is 2.14 times faster than PSO, 2.8 times faster than the ant colony algorithm, 2.9 times faster than the bee colony algorithm, and 8.38 times faster than the genetic algorithm.
Keywords: cloud-based service; scheduling algorithm; resource constraint; load optimization; cloud computing; plant growth simulation algorithm
Short-Term Wind Power Forecast Based on STL-IAOA-iTransformer Algorithm: A Case Study in Northwest China
Authors: Zhaowei Yang, Bo Yang, Wenqi Liu, Miwei Li, Jiarong Wang, Lin Jiang, Yiyan Sang, Zhenning Pan. 《Energy Engineering》, 2025, Issue 2, pp. 405-430 (26 pages)
Accurate short-term wind power forecast technique plays a crucial role in maintaining the safety and economic efficiency of smart grids. Although numerous studies have employed various methods to forecast wind power, there remains a research gap in leveraging swarm intelligence algorithms to optimize the hyperparameters of the Transformer model for wind power prediction. To improve the accuracy of short-term wind power forecast, this paper proposes a hybrid short-term wind power forecast approach named STL-IAOA-iTransformer, which is based on seasonal and trend decomposition using LOESS (STL) and iTransformer model optimized by improved arithmetic optimization algorithm (IAOA). First, to fully extract the power data features, STL is used to decompose the original data into components with less redundant information. The extracted components as well as the weather data are then input into iTransformer for short-term wind power forecast. The final predicted short-term wind power curve is obtained by combining the predicted components. To improve the model accuracy, IAOA is employed to optimize the hyperparameters of iTransformer. The proposed approach is validated using real-generation data from different seasons and different power stations in Northwest China, and ablation experiments have been conducted. Furthermore, to validate the superiority of the proposed approach under different wind characteristics, real power generation data from southwest China are utilized for experiments. The comparative results with the other six state-of-the-art prediction models in experiments show that the proposed model well fits the true value of generation series and achieves high prediction accuracy.
Keywords: short-term wind power forecast; improved arithmetic optimization algorithm; iTransformer algorithm; SimuNPS
Unveiling Effective Heuristic Strategies: A Review of Cross-Domain Heuristic Search Challenge Algorithms
Authors: Mohamad Khairulamirin Md Razali, Masri Ayob, Abdul Hadi Abd Rahman, Razman Jarmin, Chian Yong Liu, Muhammad Maaya, Azarinah Izaham, Graham Kendall. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 2, pp. 1233-1288 (56 pages)
The Cross-domain Heuristic Search Challenge (CHeSC) is a competition focused on creating efficient search algorithms adaptable to diverse problem domains. Selection hyper-heuristics are a class of algorithms that dynamically choose heuristics during the search process. Numerous selection hyper-heuristics have different implementation strategies. However, comparisons between them are lacking in the literature, and previous works have not highlighted the beneficial and detrimental implementation methods of different components. The question is how to effectively employ them to produce an efficient search heuristic. Furthermore, the algorithms that competed in the inaugural CHeSC have not been collectively reviewed. This work conducts a review analysis of the top twenty competitors from this competition to identify effective and ineffective strategies influencing algorithmic performance. A summary of the main characteristics and classification of the algorithms is presented. The analysis underlines efficient and inefficient methods in eight key components, including search points, search phases, heuristic selection, move acceptance, feedback, Tabu mechanism, restart mechanism, and low-level heuristic parameter control. This review analyzes the components referencing the competition’s final leaderboard and discusses future research directions for these components. The effective approaches, identified as having the highest quality index, are mixed search point, iterated search phases, relay hybridization selection, threshold acceptance, mixed learning, Tabu heuristics, stochastic restart, and dynamic parameters. Findings are also compared with recent trends in hyper-heuristics. This work enhances the understanding of selection hyper-heuristics, offering valuable insights for researchers and practitioners aiming to develop effective search algorithms for diverse problem domains.
Keywords: hyper-heuristics; search algorithms; optimization; heuristic selection; move acceptance; learning; diversification; parameter control
Multi-Objective Hybrid Sailfish Optimization Algorithm for Planetary Gearbox and Mechanical Engineering Design Optimization Problems
Authors: Miloš Sedak, Maja Rosic, Božidar Rosic. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 2, pp. 2111-2145 (35 pages)
This paper introduces a hybrid multi-objective optimization algorithm, designated HMODESFO, which amalgamates the exploratory prowess of Differential Evolution (DE) with the rapid convergence attributes of the Sailfish Optimization (SFO) algorithm. The primary objective is to address multi-objective optimization challenges within mechanical engineering, with a specific emphasis on planetary gearbox optimization. The algorithm is equipped with the ability to dynamically select the optimal mutation operator, contingent upon an adaptive normalized population spacing parameter. The efficacy of HMODESFO has been substantiated through rigorous validation against established industry benchmarks, including a suite of Zitzler-Deb-Thiele (ZDT) and Deb-Thiele-Laumanns-Zitzler (DTLZ) problems, where it exhibited superior performance. The outcomes underscore the algorithm’s markedly enhanced optimization capabilities relative to existing methods, particularly in tackling highly intricate multi-objective planetary gearbox optimization problems. Additionally, the performance of HMODESFO is evaluated against selected well-known mechanical engineering test problems, further accentuating its adeptness in resolving complex optimization challenges within this domain.
Keywords: multi-objective optimization; planetary gearbox; gear efficiency; sailfish optimization; differential evolution; hybrid algorithms
Enhanced Multi-Object Dwarf Mongoose Algorithm for Optimization Stochastic Data Fusion Wireless Sensor Network Deployment
Authors: Shumin Li, Qifang Luo, Yongquan Zhou. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 2, pp. 1955-1994 (40 pages)
Wireless sensor network deployment optimization is a classic NP-hard problem and a popular topic in academic research. However, the current research on wireless sensor network deployment problems uses overly simplistic models, and there is a significant gap between the research results and actual wireless sensor networks. Some scholars have now modeled data fusion networks to make them more suitable for practical applications. This paper will explore the deployment problem of a stochastic data fusion wireless sensor network (SDFWSN), a model that reflects the randomness of environmental monitoring and uses data fusion techniques widely used in actual sensor networks for information collection. The deployment problem of SDFWSN is modeled as a multi-objective optimization problem. The network life cycle, spatiotemporal coverage, detection rate, and false alarm rate of SDFWSN are used as optimization objectives to optimize the deployment of network nodes. This paper proposes an enhanced multi-objective mongoose optimization algorithm (EMODMOA) to solve the deployment problem of SDFWSN. First, to overcome the shortcomings of the DMOA algorithm, such as its low convergence and tendency to get stuck in a local optimum, an encircling and hunting strategy is introduced into the original algorithm to propose the EDMOA algorithm. The EDMOA algorithm is designed as the EMODMOA algorithm by selecting reference points using the K-Nearest Neighbor (KNN) algorithm. To verify the effectiveness of the proposed algorithm, the EMODMOA algorithm was tested at CEC 2020 and achieved good results. In the SDFWSN deployment problem, the algorithm was compared with the Non-dominated Sorting Genetic Algorithm II (NSGAII), Multiple Objective Particle Swarm Optimization (MOPSO), Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), and Multi-Objective Grey Wolf Optimizer (MOGWO). By comparing and analyzing the performance evaluation metrics and optimization results of the objective functions of the multi-objective algorithms, the algorithm outperforms the other algorithms in the SDFWSN deployment results. To better demonstrate the superiority of the algorithm, simulations of diverse test cases were also performed, and good results were obtained.
Keywords: stochastic data fusion wireless sensor networks; network deployment; spatiotemporal coverage; dwarf mongoose optimization algorithm; multi-objective optimization
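An efficient K-medoids clustering algorithm. Cited by: 47
Authors: 夏宁霞, 苏一丹, 覃希. 《计算机应用研究》 (CSCD, PKU Core), 2010, Issue 12, pp. 4517-4519 (3 pages)
To address the K-medoids algorithm's sensitivity to the choice of initial medoids and its poor performance when clustering large datasets, an improved K-medoids algorithm based on fine-tuning of the initial centers and an incremental candidate set of centers is proposed. The new algorithm optimizes the initial centers by fine-tuning and reduces the computational complexity of center swapping by expanding the candidate set of centers step by step. Experimental results show that, relative to the traditional K-medoids algorithm, the new algorithm improves clustering quality and effectively shortens computation time.
Keywords: clustering; K-medoids algorithm; center fine-tuning; incremental candidates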
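K-medoids clustering algorithm based on distance inequalities. Cited by: 15
Authors: 余冬华, 郭茂祖, 刘扬, 任世军, 刘晓燕, 刘国军. 《软件学报》 (EI, CSCD, PKU Core), 2017, Issue 12, pp. 3115-3128 (14 pages)
This work studies how to accelerate K-medoids clustering. Starting from the PAM (partitioning around medoids) and TPAM (triangular inequality elimination criteria PAM) algorithms, two acceleration lemmas are given, and two new acceleration theorems are proposed based on distance inequalities between medoids. Combining the lemmas and theorems at an extra memory cost of O(n+K^2) yields the accelerated SPAM (speed up PAM) clustering algorithm, which lowers the complexity of K-medoids clustering from O(K(n-K)^2) to O((n-K)^2). Experiments on real and synthetic datasets show improvements over the reference algorithms PAM, TPAM, and FKMEDOIDS (fast K-medoids), with running time improved over PAM by at least a factor of 0.828.
Keywords: data mining; clustering algorithm; k-medoids; distance inequality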
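The abstract above does not state its lemmas or theorems, so the sketch below is not the SPAM algorithm. It only illustrates the general kind of pruning that distance inequalities between medoids make possible, using a well-known triangle-inequality bound: if d(m_i, m_j) >= 2 d(x, m_i), then d(x, m_j) >= d(x, m_i), so medoid m_j cannot be closer to x and the distance d(x, m_j) never has to be computed. The function name `assign_with_pruning`, the Euclidean metric, and the toy data are assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_with_pruning(points, medoids):
    """Assign points to nearest medoids, skipping provably useless distances.

    Returns (labels, skipped), where skipped counts the avoided computations.
    """
    k = len(medoids)
    # K x K table of medoid-to-medoid distances: the O(K^2) extra memory.
    mm = [[dist(medoids[i], medoids[j]) for j in range(k)] for i in range(k)]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, dist(p, medoids[0])
        for j in range(1, k):
            # Triangle inequality: d(p, m_j) >= d(m_best, m_j) - d(p, m_best).
            if mm[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = dist(p, medoids[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped


# Hypothetical toy data on a line, with medoids at x = 0 and x = 10.
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(assign_with_pruning(pts, [(0.0, 0.0), (10.0, 0.0)]))
```

The pruning never changes the assignment; it only avoids distance evaluations, so the saving grows when the metric is expensive or the clusters are well separated.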
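K-medoids clustering algorithm based on granular computing. Cited by: 39
Authors: 马箐, 谢娟英. 《计算机应用》 (CSCD, PKU Core), 2012, Issue 7, pp. 1973-1977 (5 pages)
The clustering results of the traditional K-medoids algorithm fluctuate with the choice of initial medoids, and its high computational complexity makes it unsuitable for large-scale datasets. The fast K-medoids algorithm improves on the traditional one by choosing suitable initial cluster centers, but its initial centers may fall in the same cluster. To overcome the shortcomings of both, a K-medoids clustering algorithm based on granular computing is proposed. The algorithm introduces the notion of granularity, defines a new sample similarity function, generates granules from an equivalence relation, defines granule density by the number of samples a granule contains, and takes the central samples of the K densest granules as the initial cluster centers for K-medoids clustering. Experiments on datasets from the UCI machine learning repository and on randomly generated synthetic datasets show that the granular-computing-based algorithm obtains better initial cluster centers, achieves better clustering accuracy and sum of squared errors than the traditional and fast K-medoids algorithms, gives more stable clustering results, and is suitable for large-scale datasets.
Keywords: traditional K-medoids clustering algorithm; fast K-medoids clustering algorithm; granular computing; equivalence relation; clustering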
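Research on a parallel K-Medoids algorithm for multi-core platforms. Cited by: 9
Authors: 李静滨, 杨柳, 华蓓. 《计算机应用研究》 (CSCD, PKU Core), 2011, Issue 2, pp. 498-500 (3 pages)
Keywords: multi-core; k-medoids algorithm; parallel algorithm; OpenMP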
