Abstract
As new technologies are applied to the asset management of power information and communication equipment and the ubiquitous power Internet of Things is gradually rolled out, every stage of asset management and operation generates large volumes of data. During data acquisition, sensor faults and transmission bit errors introduce anomalies into the data set, which seriously hampers subsequent analysis. To reduce this impact and to screen out and recover anomalous data, this paper proposes a data cleaning algorithm based on the stacked denoising autoencoder (Data Cleaning algorithm Based on SDAE, DCbS) for power information and communication asset data. Because data from the same asset are correlated in time, DCbS, unlike an ordinary autoencoder, uses a sliding time window to preserve the short-term correlation between samples, which improves both the accuracy of data recovery and the ability to identify anomalous points. DCbS also adds residual analysis between the noisy data and the noise-free (lossless) data and trains the model on the residuals, which improves the algorithm's ability to distinguish and recover anomalous points and, to some extent, its efficiency. Finally, a comparison with existing algorithms shows the advantages of DCbS in both data recovery and outlier identification.
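The two ideas the abstract combines, sliding time windows and residual-based identification of anomalous points, can be illustrated with a minimal sketch. This is not the DCbS implementation: the trained SDAE is replaced here by a simple window-median estimator so that the example stays self-contained, and the names `sliding_windows`, `reconstruct`, `clean`, `width`, and `threshold` are hypothetical choices made for illustration only.

```python
from statistics import median


def sliding_windows(series, width):
    """Split a series into overlapping windows of `width` consecutive samples."""
    if width < 1 or width > len(series):
        raise ValueError("width must be between 1 and len(series)")
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def reconstruct(series, width):
    """Estimate each sample from the windows that cover it.

    ASSUMPTION: a window median stands in for the trained SDAE's
    reconstruction; the paper's model is a neural network, not a median.
    """
    windows = sliding_windows(series, width)
    estimates = [median(w) for w in windows]  # one estimate per window
    recon = []
    for i in range(len(series)):
        first = max(0, i - width + 1)      # first window containing sample i
        last = min(i, len(windows) - 1)    # last window containing sample i
        recon.append(median(estimates[first:last + 1]))
    return recon


def clean(series, width, threshold):
    """Flag samples whose residual exceeds `threshold` and replace them.

    Returns (cleaned_series, flagged_indices). `threshold` is a
    hypothetical tuning parameter, not a value taken from the paper.
    """
    recon = reconstruct(series, width)
    residuals = [abs(x - r) for x, r in zip(series, recon)]
    flagged = [i for i, res in enumerate(residuals) if res > threshold]
    cleaned = list(series)
    for i in flagged:
        cleaned[i] = recon[i]  # recover the anomalous point from its neighbours
    return cleaned, flagged


if __name__ == "__main__":
    readings = [10.0, 10.2, 10.1, 10.3, 55.0, 10.2, 10.4, 10.3, 10.5]
    cleaned, flagged = clean(readings, width=3, threshold=2.0)
    print("flagged indices:", flagged)
    print("cleaned series:", cleaned)
```

In this sketch the overlapping windows carry the short-term correlation: each sample is judged against its neighbours rather than against the whole series, and only samples with a large residual are replaced, so normal readings pass through unchanged.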
Authors
赵敏
王慧卿
张超
李洋
张建亮
高枫
任学武
ZHAO Min; WANG Hui-qing; ZHANG Chao; LI Yang; ZHANG Jian-liang; GAO Feng; REN Xue-wu (1. State Grid Shanxi Electric Power Company Information and Communication Branch, Taiyuan 030001, China; 2. Beijing Qianrunhe Technology Co., Ltd., Beijing 100190, China)
Source
Journal of Shandong Agricultural University (Natural Science Edition)
Peking University Core Journal (北大核心)
2019, No. 6, pp. 1093-1096 (4 pages)
Funding
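Research project on full life-cycle management of information and communication assets in the new-technology environment of big data, cloud computing, Internet of Things, and mobile internet ("大云物移") (SGSXXT00JFJS1800109)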
Keywords
information and communication assets
data cleaning
stacked denoising autoencoder
residual analysis
outlier identification