Research on coagulation dosing prediction based on K-Means clustering and ensemble learning under small sample data
Abstract: A PAC dosage prediction method based on K-Means clustering and ensemble learning is proposed to address the small-sample problem in coagulant dosing prediction. First, K-Means clustering on two features, raw water turbidity and water temperature, divides the water quality into three classes, and stratified sampling draws the training and test sets from these three classes. Second, a PAC dosage ensemble prediction model (KM-Bagging) is built on the Bagging ensemble learning algorithm from seven learners: support vector machine, random forest, Adaboost, GBDT (gradient boosting decision tree), Catboost, XGBoost, and LightGBM. Finally, the method is validated on operating data from 2021 to 2022 from a water supply plant in Yinchuan. The results show that the KM-Bagging model predicts PAC dosage from small samples with high accuracy: R² exceeds 0.8 and MAPE is below 5%. Predictions based on 6 or 9 months of daily monitoring data suit cases where the monitoring period is short and accuracy requirements are modest, and they can serve as a reference for adjusting PAC dosage when raw water quality changes abruptly. With 1 year of daily monitoring data, prediction accuracy meets the requirements of engineering applications and can provide auxiliary guidance for actual PAC dosing at the plant. The results offer a reference for modeling and predicting coagulant dosing with small-sample data.
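The data-preparation steps named in the abstract (K-Means clustering into three water-quality classes, then stratified sampling of training and test sets) and the two reported metrics (R² and MAPE) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption, not taken from the paper: the synthetic turbidity/temperature values, the feature standardization, the 20% test fraction, and all function names are hypothetical. The seven base learners and the Bagging aggregation are not shown.

```python
import random
from collections import defaultdict


def standardize(rows):
    """Scale each column to zero mean and unit variance (an assumed preprocessing step)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def stratified_split(labels, test_frac=0.2, seed=0):
    """Draw the same fraction of test samples from every cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for i, lab in enumerate(labels):
        by_cluster[lab].append(i)
    train, test = [], []
    for idx in by_cluster.values():
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_frac))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def r2(y, y_hat):
    """Coefficient of determination."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)


# Synthetic (turbidity, temperature) pairs, invented purely for illustration.
rng = random.Random(42)
raw = ([(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(40)]
       + [(rng.gauss(20, 2), rng.gauss(15, 1)) for _ in range(40)]
       + [(rng.gauss(60, 5), rng.gauss(25, 1)) for _ in range(40)])

labels = kmeans(standardize(raw), k=3)
train_idx, test_idx = stratified_split(labels)
```

In a fuller pipeline, `train_idx` and `test_idx` would select the rows used to fit and evaluate the base learners, and `r2` and `mape` would score the predicted PAC dosage against the recorded dosage.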
Authors: WANG Shijie; LI Yiming; ZHI Yin; WU Renchao; WANG Tao; CHENG Ziwei; ZHENG Lei; XIAO Feng (College of Water Resources and Hydropower Engineering, North China Electric Power University, Beijing 102206, China; Beijing Global Water Technology Co., Ltd., Beijing 100085, China; Ningxia Great Wall Water Co., Ltd., Yinchuan 750004, China)
Source: Chinese Journal of Environmental Engineering (《环境工程学报》), 2024, No. 1, pp. 181-188 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (52030003).
Keywords: coagulation dosage prediction; small sample data; Bagging ensemble learning; K-Means clustering
