Abstract
The massive fault data uploaded by waveform-recording fault indicators in the power grid contain large numbers of repeated, interference, erroneous, and invalid waveforms. To address this, a clustering-based cleaning method for fault data built on the sparse auto-encoder (SAE) is proposed. The method first uses an SAE to learn features from the fault data and reduce their dimensionality, then applies principal component analysis (PCA) to the reduced data for a second round of dimensionality reduction, yielding the features of the different kinds of fault data. Finally, clustering by fast search and find of density peaks (CFSFDP) is applied to the fault features, so that repeated, interference, and erroneous fault data are cleaned out and genuine fault data are pushed. The proposed method cleans and de-duplicates different types of fault data, provides technical support for the intelligent pushing of fault alarms, and improves the efficiency with which operation and maintenance personnel obtain accurate fault information.
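The last two stages of the pipeline described above (PCA reduction followed by density-peak clustering) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the SAE stage is omitted, and synthetic vectors stand in for SAE-encoded waveform features. The function names `pca_reduce` and `density_peak_cluster`, the Gaussian-kernel density, the `cutoff` value, and the choice of centres by the largest product of density and separation are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their top principal components (via SVD)."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def density_peak_cluster(points, cutoff, n_clusters):
    """Minimal density-peak clustering; returns (labels, rho, delta)."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density rho: Gaussian-kernel neighbour count (self term removed).
    rho = np.exp(-(dist / cutoff) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices from densest to sparsest
    delta = np.zeros(n)       # distance to the nearest denser point
    parent = np.full(n, -1)   # index of that nearest denser point
    # The densest point has no denser neighbour; use its largest distance.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Assumed selection rule: centres have the largest rho * delta.
    centres = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centres] = np.arange(n_clusters)
    # Visit points by decreasing density so each parent is labelled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta


if __name__ == "__main__":
    # Synthetic stand-in for SAE features: three groups in 10 dimensions.
    rng = np.random.default_rng(0)
    group_means = np.zeros((3, 10))
    group_means[1, 0] = 10.0
    group_means[2, 1] = 10.0
    data = np.vstack(
        [m + 0.3 * rng.standard_normal((30, 10)) for m in group_means]
    )
    reduced = pca_reduce(data, n_components=2)
    labels, _, _ = density_peak_cluster(reduced, cutoff=1.0, n_clusters=3)
    print(np.bincount(labels))
```

In a cleaning setting, each resulting cluster would then be inspected or labelled once (for example as repeated, interference, or genuine fault waveforms), so that only the genuine class is pushed; how the paper maps clusters to those classes is not stated in the abstract.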
Authors
LI Li-sheng (李立生)
LIU Yang (刘洋)
LU Wen-hua (卢文华)
ZHANG Shi-dong (张世栋)
ZHANG Lin-li (张林利)
Affiliations: State Grid Shandong Electric Power Research Institute, Jinan 250002, China; State Grid Electric Power Research Institute Wuhan Nanrui, Wuhan 430000, China; Nanjing Nanrui Group, Nanjing 210000, China
Source
Science Technology and Engineering (《科学技术与工程》)
PKU Core Journal (北大核心)
2021, Issue 15, pp. 6330-6336 (7 pages)
Funding
Science and Technology Project of State Grid Corporation of China Headquarters (52060019000T).
Keywords
data cleansing
sparse auto-encoder (SAE)
principal component analysis (PCA)
clustering analysis