Abstract
Gas chromatography data from oil-immersed transformers are multivariate time-series sensor data, and equipment or network failures often leave values missing. Imputation is usually required to form a complete dataset before the data can support further business analysis and research. However, existing imputation models cannot simultaneously handle two inherent characteristics of multivariate time series, temporal irregularity (uneven time intervals) and temporal bidirectionality, which leads to low imputation efficiency and results that are hard to guarantee. To address this, this paper proposes Conv-WGAIN, a model based on Generative Adversarial Imputation Nets (GAIN). Using a constructed imputation feature map, 2D convolution learns temporal features from both the forward and backward directions and handles data with irregular time intervals. The Wasserstein distance is introduced in the discriminator to distinguish generated imputed data from real observed data, which improves the stability of the generator. Experiments on gas chromatography datasets from a real project and on 3 public datasets show that the model is generally applicable to multivariate time series with missing data, and that at different missing rates its imputation results are better than those of the other baseline models, with RMSE reduced by 20.75% to 73.37%.
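The reported metric can be illustrated with a minimal sketch. It assumes, as is common for GAIN-style evaluation but not stated in this record, that RMSE is computed only over the entries that were missing, and that the final imputed matrix keeps observed values and fills the rest with generator output. The function names `combine`, `masked_rmse`, and `relative_reduction` are hypothetical and are not taken from the paper.

```python
import numpy as np


def combine(x_observed, x_generated, mask):
    """Keep observed values (mask == 1); fill missing ones (mask == 0)
    with the generator's output. Assumed GAIN-style combination."""
    return mask * x_observed + (1 - mask) * x_generated


def masked_rmse(x_true, x_imputed, mask):
    """RMSE restricted to entries that were missing (mask == 0)."""
    missing = mask == 0
    diff = x_true[missing] - x_imputed[missing]
    return float(np.sqrt(np.mean(diff ** 2)))


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Toy data: 2 time steps x 2 variables, with two entries missing.
x_true = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
x_observed = x_true * mask                       # missing entries zeroed out
x_generated = np.array([[9.0, 2.5], [2.0, 9.0]])  # stand-in generator output

x_imputed = combine(x_observed, x_generated, mask)
print(masked_rmse(x_true, x_imputed, mask))
```

Under this reading, a statement such as "RMSE reduced by 20.75% to 73.37%" corresponds to the range of `relative_reduction` values across baselines and missing rates.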
Authors
LIU Zi-jian (刘子建)
DING Wei-long (丁维龙)
XING Meng-da (邢梦达)
LI Han (李寒)
HUANG Ye (黄晔)
LIU Zi-jian; DING Wei-long; XING Meng-da; LI Han; HUANG Ye (School of Information Science and Technology, North China University of Technology, Beijing 100144; Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data, Beijing 100144; Information Center of the National Defense Mobilization Department of the Central Military Commission, Beijing 100034, China)
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
PKU Core Journal (北大核心)
2023, Issue 5, pp. 931-939 (9 pages)
Funding
Beijing Natural Science Foundation (4202021).
Keywords
generative adversarial imputation nets
multivariate time series data
convolutional neural network
Wasserstein distance
missing value imputation