Missing Data Imputation Method Based on VAEGAN
Abstract: Data integrity is important for research in artificial intelligence and data mining. However, data are often missing, for various reasons, at some point between acquisition and application. To reduce the adverse impact of missing data on downstream use, a missing data imputation model based on the Variational Autoencoder Generative Adversarial Network (VAEGAN) is proposed. The model constructs a missing mask from the missing-value information in the incomplete dataset. Using this mask, the reconstruction loss function and the discriminator loss function are designed so that no complete data are required. Estimates of the missing values are generated on the incomplete dataset by variational inference, and the generator network is trained adversarially against a discriminator. Finally, the model is compared experimentally with commonly used imputation algorithms on different datasets and at different missing rates.
Authors: XU Yebo; NI Yingjie (Information Engineering University, Zhengzhou 450001, China; Jiangnan Institute of Computing Technology, Wuxi 214083, China)
Source: Journal of Information Engineering University (《信息工程大学学报》), 2022, No. 2, pp. 224-229 (6 pages)
Keywords: missing data imputation; generative adversarial network (GAN); variational autoencoder (VAE)
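
The mask construction and mask-based reconstruction loss mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the loss formulas, so the squared-error form, the NaN encoding of missing values, and the function names `build_missing_mask`, `masked_reconstruction_loss`, and `impute` are all assumptions made here. The array `x_hat` stands in for the output of a generator network, which is not modeled.

```python
import numpy as np

def build_missing_mask(x):
    """Return a 0/1 mask: 1 where a value is observed, 0 where it is missing (NaN)."""
    return (~np.isnan(x)).astype(np.float64)

def masked_reconstruction_loss(x, x_hat, mask):
    """Mean squared error over observed entries only (assumed loss form)."""
    x_filled = np.nan_to_num(x, nan=0.0)  # placeholder zeros; masked out below
    sq_err = mask * (x_filled - x_hat) ** 2
    return sq_err.sum() / max(mask.sum(), 1.0)

def impute(x, x_hat, mask):
    """Keep observed values and fill missing ones with the generated estimates."""
    x_filled = np.nan_to_num(x, nan=0.0)
    return mask * x_filled + (1.0 - mask) * x_hat

# Toy incomplete data; NaN marks a missing entry (an encoding assumed here).
x = np.array([[1.0, np.nan, 3.0],
              [np.nan, 5.0, 6.0]])
# Hypothetical generator output, hard-coded for illustration.
x_hat = np.array([[1.5, 2.0, 3.0],
                  [4.0, 5.0, 5.0]])

mask = build_missing_mask(x)
loss = masked_reconstruction_loss(x, x_hat, mask)
completed = impute(x, x_hat, mask)
```

Because the loss is multiplied by the mask, entries that were never observed contribute nothing to it, which is one way a model could be trained without any complete samples.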