Abstract
To address the redundancy and low value density of power grid load data, this paper proposes an ensemble learning method that combines the K-means algorithm with Pearson correlation coefficient calculation to clean and deduplicate load data. In a simulation experiment, 730 consecutive days of load data from one region are clustered, sliced, sorted, compared, and deduplicated to obtain a cleaned data set. The cleaned data set and the original data set are then fed into the same BP neural network model and random forest model for load forecasting. The experimental results show that the two data sets have similar feature characteristics and data mining potential. Compared with traditional data deduplication methods, the proposed data cleaning strategy performs better in both efficiency and accuracy when preprocessing the training set, and can support the training of network models used for load forecasting.
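The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of pairing K-means clustering with a Pearson correlation check. It assumes each day is one load curve, that comparisons are made only between days in the same cluster, and that a day counts as redundant when its correlation with an already kept day reaches a threshold. The function names, the threshold of 0.99, and the toy curves are all hypothetical; the paper's slicing and sorting steps are not reproduced.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant curve
    return cov / (sx * sy)


def kmeans(curves, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per curve."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(curves, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(c, centers[j])))
            for c in curves
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def deduplicate(curves, k=2, threshold=0.99, seed=0):
    """Return indices of curves kept after within-cluster correlation checks."""
    labels = kmeans(curves, k, seed=seed)
    kept = []
    for i, curve in enumerate(curves):
        # Assumption: a day is redundant if it is highly correlated with
        # an already kept day from the same cluster.
        duplicate = any(
            labels[j] == labels[i] and pearson(curve, curves[j]) >= threshold
            for j in kept
        )
        if not duplicate:
            kept.append(i)
    return kept


# Toy daily load curves (hypothetical values, four samples per day).
curves = [
    [1.0, 2.0, 3.0, 2.0],      # day 0
    [1.01, 2.02, 3.03, 2.02],  # day 1: near copy of day 0
    [10.0, 20.0, 30.0, 20.0],  # day 2: same shape, much higher load
    [3.0, 2.0, 1.0, 2.0],      # day 3: opposite shape
]
kept = deduplicate(curves, k=2, threshold=0.99)
print(kept)  # [0, 2, 3]
```

In this toy case day 2 is perfectly correlated with day 0 but is kept, because Pearson correlation ignores magnitude and the clustering step places the two days in different clusters. That is one plausible reason to combine the two techniques rather than rely on correlation alone.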
Authors
Zhao Yao (赵耀)
Yu Lijuan (虞莉娟)
Su Yixin (苏义鑫)
Zheng Tuo (郑拓)
Tong Guangbo (童光波)
School of Automation, Wuhan University of Technology, Wuhan 430070, China; Hubei Electric Power Company Huanggang Power Supply Company, Huanggang 438000, Wuhan, China
Source
《船电技术》
2023, No. 6, pp. 69-75 (7 pages)
Marine Electric & Electronic Engineering