Abstract
To address missing values in large water-environment databases, an optimal mixed imputation (OMI) method is proposed that takes into account the characteristics of different water-environment data and practical application scenarios. OMI selects the most suitable filling method according to features such as the sample data type and whether the data are normally distributed. The algorithmic idea of OMI is presented, and the normalized mean square error is used to evaluate the accuracy of algal frequency predictions obtained with several missing-value imputation algorithms, thereby verifying the effectiveness of OMI. The results indicate that OMI fills missing values more effectively.
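The selection idea and the error measure can be sketched as follows. The abstract does not state the actual OMI rules, so everything here is an illustrative assumption rather than the authors' method: the mean is used for numeric data that pass a Jarque-Bera normality check, the median for other numeric data, and the mode for categorical data. The function names (`jarque_bera`, `choose_fill_value`, `impute`, `nmse`) are hypothetical, and the NMSE shown is one common normalization (squared error divided by the variance sum of the actual values), which may differ from the paper's definition.

```python
import statistics
from collections import Counter


def jarque_bera(values):
    """Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4) for a numeric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0  # constant sample: treat as non-skewed
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def choose_fill_value(column):
    """Pick a fill value from the observed entries (None marks a missing entry).

    Assumed rules, not taken from the paper:
    numeric and roughly normal -> mean; numeric otherwise -> median;
    categorical -> mode.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    if numeric:
        # 5.991 is the 5% critical value of chi-square with 2 degrees of freedom.
        if jarque_bera(observed) < 5.991:
            return statistics.mean(observed)
        return statistics.median(observed)
    return Counter(observed).most_common(1)[0][0]


def impute(column):
    """Return a copy of the column with every None replaced by the chosen value."""
    fill = choose_fill_value(column)
    return [fill if v is None else v for v in column]


def nmse(actual, predicted):
    """Normalized mean square error: sum of squared errors over variance sum."""
    mean_actual = sum(actual) / len(actual)
    numerator = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    denominator = sum((a - mean_actual) ** 2 for a in actual)
    return numerator / denominator
```

Under this normalization, a value of 0 means perfect prediction, and a value of 1 matches the error of always predicting the mean of the actual values, so lower is better when comparing imputation methods.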
Authors
夏立玲
朱跃龙
XIA Li-ling; ZHU Yue-long (College of Computer & Software, Nanjing Vocational Institute of Industry Technology, Nanjing 210023, China; College of Computer and Information Engineering, Hohai University, Nanjing 210098, China)
Source
《水电能源科学》
Peking University Core Journal (北大核心)
2018, No. 4, pp. 158-161, 85 (5 pages in total)
Water Resources and Power
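Funding
National Natural Science Foundation of China, Young Scientists Fund (51509129)
China Higher Vocational and Technical Education Research Association project (GZYLX1213154)
Open Fund of the Jiangsu Intelligent Sensor Network Engineering Technology Research and Development Center (ZK13-02-05)
Major Project of Nanjing Vocational Institute of Industry Technology (YK14-04-02)
2017 Jiangsu Province Outstanding Science and Technology Innovation Team of Universities (Industrial Big Data Application Technology) (902050617TD003)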