Abstract
Community discovery algorithms suffer from redundant results and high time complexity. Association rule mining is an effective approach to community discovery, but it is bottlenecked by a large number of iterative computations. To address these problems, this paper proposes SIACD (Spark-based use of improved Apriori to achieve community detection), an improved community discovery algorithm. The algorithm introduces MAC (media access control) addresses and a Boolean matrix to preprocess the data, improves the Apriori algorithm with an item-count-based Boolean vector intersection operation, parallelizes the computation on Spark, and mines wireless community data through association rules. Experimental results show that SIACD resolves the problems of redundant results, high complexity, and iterative computation, speeds up community discovery mining, and improves the ability to process big data.
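The Boolean vector intersection idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' SIACD implementation: the item-count-based refinement and the Spark parallelization are not reproduced, and the function names, the sample MAC-style labels, and the use of Python integers as Boolean column vectors are all assumptions made for illustration. Each item gets one bitmask (a column of the Boolean matrix), and the support of an itemset is the number of set bits in the AND of its columns, so no pass over the raw transactions is needed after preprocessing.

```python
from itertools import combinations


def popcount(vec):
    """Number of set bits, i.e. number of transactions that contain the itemset."""
    return bin(vec).count("1")


def build_columns(transactions):
    """Return {item: bitmask}; bit i is set when transaction i contains the item."""
    columns = {}
    for i, items in enumerate(transactions):
        for item in items:
            columns[item] = columns.get(item, 0) | (1 << i)
    return columns


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori where support counting is a bitwise AND of column vectors.

    Hypothetical helper for illustration; returns {sorted item tuple: support}.
    """
    columns = build_columns(transactions)
    # Level 1: keep single items whose column has enough set bits.
    current = {
        (item,): vec
        for item, vec in columns.items()
        if popcount(vec) >= min_support
    }
    result = {}
    k = 1
    while current:
        for itemset, vec in current.items():
            result[itemset] = popcount(vec)
        k += 1
        keys = sorted(current)
        next_level = {}
        for a, b in combinations(keys, 2):
            # Join step: merge two (k-1)-itemsets that share their first k-2 items.
            if a[:-1] != b[:-1]:
                continue
            candidate = a + (b[-1],)
            # Prune step: every (k-1)-subset of the candidate must be frequent.
            if any(sub not in current for sub in combinations(candidate, k - 1)):
                continue
            # Support counting: intersect the two Boolean vectors.
            vec = current[a] & current[b]
            if popcount(vec) >= min_support:
                next_level[candidate] = vec
        current = next_level
    return result


# Hypothetical sample: each set lists device labels observed together.
sample = [
    {"aa:01", "aa:02", "aa:03"},
    {"aa:01", "aa:02"},
    {"aa:01", "aa:03"},
    {"aa:02", "aa:03"},
    {"aa:01", "aa:02", "aa:03"},
]
for itemset, support in sorted(frequent_itemsets(sample, 3).items()):
    print(itemset, support)
```

In this sketch the candidate vectors from one level are reused at the next, which is what removes the repeated database scans of classic Apriori; distributing the per-candidate intersections across workers would be the natural place for Spark, under the same assumptions.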
Authors
WANG Yonggui (王永贵)
XU Shanshan (徐山珊)
XIAO Chenglong (肖成龙)
College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
Indexed in: CSCD; PKU Core (北大核心)
2019, No. 9, pp. 1582-1592 (11 pages)
Funding
Youth Science Fund Project of the National Natural Science Foundation of China