Improved K-Means Algorithm Based Analysis on Massive Data of Intelligent Power Utilization (cited by: 125)
Abstract: Data mining of intelligent power utilization data faces large data volumes and low mining efficiency. To address these difficulties, this paper studies the analysis of massive power utilization data with an improved k-means algorithm under the Map-Reduce model. Taking household users as an example, a data dimension model of household electricity information is established, covering household user number, floor area, number of family members, daily electricity consumption, peak and valley electricity consumption, and number of household appliances. The improvement keeps the simplicity and fast convergence of k-means while overcoming its tendency to fall into a local optimum. It jointly considers two factors, the selection of the initial cluster centers and the selection of the number of clusters: the density of data objects serves as the criterion for selecting initial cluster centers, and the distance between clusters together with the dispersion of objects within each cluster serves as an important reference for choosing the number of clusters. To improve processing efficiency, the massive household power utilization data is mined in parallel under the Map-Reduce processing model. Experiments on a Hadoop cluster show that the proposed algorithm is stable, efficient, and feasible, and that it achieves a good speedup ratio.
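The abstract names three ingredients without giving formulas: density-based selection of initial centers, a cluster-number criterion built from inter-cluster distance and intra-cluster dispersion, and a Map-Reduce style iteration. The following is a minimal single-machine sketch of how these pieces could fit together, not the authors' implementation. The function names, the `radius` neighbourhood parameter, and the particular `validity` ratio are all assumptions made for illustration, and the map and reduce steps are plain Python functions rather than Hadoop jobs.

```python
import math
from collections import defaultdict

def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def density_init(points, k, radius):
    """Assumed rule: take the densest points that are more than `radius` apart.
    May return fewer than k centers if the data has too few dense regions."""
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers

def map_step(point, centers):
    """Map: emit (index of nearest center, point)."""
    idx = min(range(len(centers)), key=lambda j: dist(point, centers[j]))
    return idx, point

def reduce_step(pairs, centers):
    """Reduce: average the points grouped under each center index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)  # centers with no points keep their position
    for idx, members in groups.items():
        new_centers[idx] = tuple(sum(col) / len(members) for col in zip(*members))
    return new_centers

def kmeans(points, k, radius, max_iter=50, tol=1e-9):
    """Iterate map and reduce until the centers stop moving."""
    centers = density_init(points, k, radius)
    for _ in range(max_iter):
        pairs = [map_step(p, centers) for p in points]
        new_centers = reduce_step(pairs, centers)
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    labels = [map_step(p, centers)[0] for p in points]
    return centers, labels

def validity(points, centers, labels):
    """Assumed index: mean within-cluster distance divided by the smallest
    between-center distance. Lower is better; needs at least two centers."""
    intra = sum(dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)
    inter = min(dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return intra / inter

# Usage on toy two-dimensional data (not the paper's household data set).
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(pts, k=2, radius=2.0)
score = validity(pts, centers, labels)
```

One way to use such an index would be to run `kmeans` for several candidate values of `k` and keep the one with the lowest `validity` score. In a real Hadoop deployment the map step would run over data splits in parallel and the reduce step would aggregate per-cluster sums, which is where the speedup reported in the abstract would come from.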
Affiliation: Chongqing Electric Power Company (重庆市电力公司)
Source: Power System Technology (《电网技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 10, pp. 2715-2720 (6 pages)
Keywords: intelligent power utilization; cloud computing; Map-Reduce processing model; k-means algorithm; parallel mining