
Research on Methods for Handling Missing Values in Industrial Process Data (cited 13 times)

Published English title: Research on approach of missing data in industrial process data
Abstract: To address missing values in industrial process data, this paper proposes, for the first time, applying multiple imputation (MI) to missing data from industrial processes. It first reviews the commonly used methods for handling missing data and points out the advantages and disadvantages of each. On that basis, it builds a regression model and treats multivariate industrial data sets under two conditions, a low missing rate and a high missing rate, with three methods: deleting the cases that contain missing values, simple imputation, and multiple imputation. The processed data sets are then used for data mining to predict the value of the target variable, and the prediction results are analyzed and compared. The experimental results show that multiple imputation gives the best results, which provides a useful strategy for handling missing values in industrial data.
Authors: Guo Chao (郭超), Lu Xinjian (陆新建)
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University core journal list, 2010, No. 6, pp. 1351-1354 (4 pages)
Keywords: missing data; multiple imputation; industrial process data; data mining; regression prediction
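
The comparison described in the abstract (case deletion, simple imputation, and multiple imputation, each followed by a regression model that predicts a target variable) can be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the synthetic data, the 10% missing rate, the choice of mean imputation as the "simple" method, the bootstrap-plus-noise imputation model, and all function names are assumptions made for the example. The record gives no data or implementation details.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) for y ~ [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def listwise_delete(X, y):
    """Drop every case (row) that contains at least one missing value."""
    keep = ~np.isnan(X).any(axis=1)
    return X[keep], y[keep]


def mean_impute(X):
    """Simple imputation: replace each missing entry with its column mean."""
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = col_means[cols]
    return filled


def multiple_impute(X, m=5, rng=None):
    """Return m completed copies of X (an assumed, simplified MI scheme).

    Each copy refits the imputation regressions on a bootstrap sample of the
    complete cases and adds random residual noise, so the copies differ and
    reflect the uncertainty about the missing values.
    """
    rng = np.random.default_rng() if rng is None else rng
    complete = X[~np.isnan(X).any(axis=1)]
    baseline = mean_impute(X)  # stand-in values for predictors that are missing
    copies = []
    for _ in range(m):
        boot = complete[rng.integers(0, len(complete), len(complete))]
        filled = X.copy()
        for j in range(X.shape[1]):
            miss = np.isnan(X[:, j])
            if not miss.any():
                continue
            others = [k for k in range(X.shape[1]) if k != j]
            coef = fit_ols(boot[:, others], boot[:, j])
            resid_sd = np.std(boot[:, j] - predict(coef, boot[:, others]))
            noise = rng.normal(0.0, resid_sd, int(miss.sum()))
            filled[miss, j] = predict(coef, baseline[miss][:, others]) + noise
        copies.append(filled)
    return copies


# Synthetic stand-in for multivariate process data (assumption, not real data).
rng = np.random.default_rng(0)
n = 500
X_full = rng.normal(size=(n, 3))
X_full[:, 2] = 0.8 * X_full[:, 0] + 0.2 * rng.normal(size=n)  # correlated variable
y_full = 1.0 + X_full @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

X_train, y_train = X_full[:400].copy(), y_full[:400]
X_test, y_test = X_full[400:], y_full[400:]

mask = rng.random(X_train.shape) < 0.10  # 10% of training entries go missing
X_miss = X_train.copy()
X_miss[mask] = np.nan

# Fit the prediction model on each processed data set.
X_del, y_del = listwise_delete(X_miss, y_train)
coef_delete = fit_ols(X_del, y_del)
coef_mean = fit_ols(mean_impute(X_miss), y_train)
copies = multiple_impute(X_miss, m=5, rng=rng)
pooled = np.mean([fit_ols(c, y_train) for c in copies], axis=0)  # pooled estimate

rmse = {
    name: float(np.sqrt(np.mean((predict(coef, X_test) - y_test) ** 2)))
    for name, coef in [("delete", coef_delete), ("mean", coef_mean), ("mi", pooled)]
}
print(rmse)
```

Averaging the coefficients over the completed copies follows the usual pooling rule for MI point estimates. Which method ranks best on a given data set depends on the missing rate and the missingness mechanism, so the printed errors here should not be read as reproducing the paper's results.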
