Analysis of the relationship between student grades and graduation destination using a Hadoop-based Canopy-K-means parallel algorithm

Cited by: 11
Abstract: To explore the intrinsic relationship between student grades and graduation destinations, this paper proposes a Hadoop-based Canopy-K-means parallel algorithm and applies it to the analysis. First, the initial Canopy center points are determined according to the "minimum and maximum principle" and a fast, coarse clustering is performed; the resulting centers serve as the initial cluster centers for the K-means algorithm, and the procedure is parallelized on the MapReduce computing framework. Then, using the educational administration data of the 2017 graduates of Xi'an Polytechnic University, a mining and analysis experiment on massive educational data is carried out: students with the same type of graduation destination are clustered, and the relationship between each graduation destination and the courses is analyzed. The experimental results show that, when processing massive data, the improved Canopy-K-means algorithm improves clustering convergence speed by about 2.1 times compared with the traditional K-means algorithm and increases accuracy by about 15 percentage points, giving a good clustering effect.
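The seeding-then-refinement idea in the abstract can be illustrated with a small single-machine sketch. It is not the authors' implementation: the abstract gives no Canopy thresholds, so the sketch swaps in a plain max-min distance rule for choosing seeds (each new seed is the point farthest from its nearest existing seed), and it imitates the MapReduce structure with two ordinary Python functions instead of running on Hadoop. All function names and the toy score data are hypothetical.

```python
import math
from collections import defaultdict


def max_min_seeds(points, k):
    """Pick k seeds: start from the first point, then repeatedly add the
    point whose distance to its nearest chosen seed is largest.
    (Assumed stand-in for the paper's Canopy-based initialization.)"""
    seeds = [points[0]]
    while len(seeds) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds


def map_assign(points, centers):
    """'Map' step: emit (index of nearest center, point) pairs."""
    for p in points:
        idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        yield idx, p


def reduce_update(pairs, centers):
    """'Reduce' step: average the points grouped under each center index.
    A center that received no points keeps its previous position."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)
    for idx, members in groups.items():
        new_centers[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centers


def kmeans(points, k, max_iter=100, tol=1e-9):
    """K-means iterations started from the max-min seeds."""
    centers = max_min_seeds(points, k)
    for _ in range(max_iter):
        new_centers = reduce_update(map_assign(points, centers), centers)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once no center moves noticeably
            break
    return centers


# Hypothetical toy data: (score in course A, score in course B) per student.
scores = [(90, 85), (88, 92), (92, 89), (60, 65), (62, 58), (58, 61)]
centers = kmeans(scores, k=2)
```

Because the seeds start far apart, this toy run settles after a single update, which is the intuition behind using a coarse pre-clustering to speed up K-means convergence. On a real cluster, the map and reduce functions would correspond to a mapper and a reducer, with the current centers distributed to every mapper at the start of each iteration.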
Authors: GUO Weixia (郭卫霞); XUE Tao (薛涛); LI Ting (李婷) (School of Computer Science, Xi'an Polytechnic University, Xi'an 710048, China)
Source: Journal of Xi'an Polytechnic University (《西安工程大学学报》), CAS, 2018, No. 6, pp. 705-712 (8 pages)
Funding: Natural Science Basic Research Plan of Shaanxi Province, General Project (2018JQ6103)
Keywords: Hadoop; Canopy-K-means; minimum and maximum principle; MapReduce; educational administration; graduation destination
