Abstract
Traditional ant colony optimization clustering algorithms face several problems when processing large-scale data: they run out of memory, fail to exploit the inherent parallelism of ant colony algorithms, and cannot handle distributed data. To address these problems, this paper proposes a parallel Ant Colony Optimization Clustering (ACOC) algorithm. Drawing on the ideas of search space replication and search space partitioning, the algorithm tackles big-data processing by reading the pheromone and the data line by line, which avoids the risk of running out of memory that arises when the pheromone is loaded all at once for very large datasets. Experimental results show that the algorithm achieves good scalability and a high speedup on large-scale data.
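The line-by-line reading idea can be illustrated with a minimal sketch. It is not the paper's implementation: the file layout (one comma-separated data point per line, with the matching line of the pheromone file holding that point's pheromone value for each cluster), the roulette-wheel selection rule, and the function name `stream_assignments` are all assumptions made for illustration.

```python
import io
import random
from typing import Iterable, Iterator, List, Tuple


def stream_assignments(
    data_lines: Iterable[str],
    pheromone_lines: Iterable[str],
    rng: random.Random,
) -> Iterator[Tuple[List[float], int]]:
    """Yield (point, cluster) pairs, holding only one row of each input at a time."""
    # zip over two file-like iterables reads them in lockstep, one line each.
    for data_line, tau_line in zip(data_lines, pheromone_lines):
        point = [float(x) for x in data_line.split(",")]
        tau = [float(x) for x in tau_line.split(",")]
        # Assumed rule: cluster j is chosen with probability tau[j] / sum(tau).
        cluster = rng.choices(range(len(tau)), weights=tau, k=1)[0]
        yield point, cluster


# In-memory stand-ins for the two files (hypothetical contents).
data_file = io.StringIO("1.0,2.0\n5.0,6.0\n")
pheromone_file = io.StringIO("1.0,0.0\n0.0,1.0\n")
result = list(stream_assignments(data_file, pheromone_file, random.Random(0)))
```

Because the function is a generator over two line iterators, peak memory depends on the width of a single row rather than on the number of data points; passing real file objects opened with `open(...)` in place of the `io.StringIO` stand-ins would work the same way.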
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2015, No. 8, pp. 168-173 (6 pages)
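Funding
National Basic Research Program of China ("973" Program) (2013CB329603)
National Natural Science Foundation of China (71071047)
Anhui Provincial Natural Science Foundation (1208085MG120)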