Abstract: Accurate influent flow rate prediction is important for operators and managers at wastewater treatment plants (WWTPs), as it is closely related to wastewater characteristics such as biochemical oxygen demand (BOD), total suspended solids (TSS), and pH. Previous studies on influent flow rate prediction have shown that data-driven models are effective tools. However, most of these studies focused on batch learning, which is inadequate for wastewater prediction in the era of COVID-19 because the influent pattern changed significantly. Online learning, which has distinct advantages in dealing with streaming data, large data sets, and changing data patterns, has the potential to address this issue. In this study, the conventional batch learning models Random Forest (RF), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP) were compared with their respective online learning counterparts, Adaptive Random Forest (aRF), Adaptive K-Nearest Neighbors (aKNN), and Adaptive Multi-Layer Perceptron (aMLP), for predicting influent flow rate at two Canadian WWTPs. The online learning models achieved higher R², lower MAPE, and lower RMSE than the conventional batch learning models in all scenarios. The R² values on the testing data set for 24-h-ahead prediction of the aRF, aKNN, and aMLP were 0.90, 0.73, and 0.87, respectively, at Plant A, and 0.75, 0.78, and 0.56, respectively, at Plant B. The proposed online learning models are effective in making reliable predictions under changing data patterns, and they are efficient in dealing with continuous and large influent data streams. They can be used to provide robust decision support for wastewater treatment and management in the changing era of COVID-19 and under other unprecedented emergencies that could change influent patterns.
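The contrast between batch and online learning described in this abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not the study's code or data: the `WindowKNN` class is a hypothetical sliding-window k-NN regressor (not the aKNN used by the authors), the stream from `make_stream` is synthetic with an artificial level shift, and the hour of day is the only feature. The online variant is scored with a test-then-train loop, while the batch variant is frozen after the training period.

```python
import math
import random
from collections import deque


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class WindowKNN:
    """Hypothetical k-NN regressor over the most recent `window` samples.

    With window=None it remembers every sample it has been given.
    """

    def __init__(self, k=3, window=None):
        self.k = k
        self.memory = deque(maxlen=window)

    def learn_one(self, x, y):
        self.memory.append((x, y))

    def predict_one(self, x):
        if not self.memory:
            return 0.0
        nearest = sorted(self.memory, key=lambda s: abs(s[0] - x))[: self.k]
        return sum(y for _, y in nearest) / len(nearest)


def make_stream(n=960, shift_at=240, seed=0):
    """Synthetic hourly flow with a daily cycle and a level shift at `shift_at`."""
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        hour = t % 24
        base = 100.0 if t < shift_at else 60.0
        flow = base + 20.0 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0, 2)
        stream.append((float(hour), flow))
    return stream


def evaluate(n_train=240):
    stream = make_stream(shift_at=n_train)
    train, test = stream[:n_train], stream[n_train:]
    batch = WindowKNN(k=3)              # trained once, then frozen
    online = WindowKNN(k=3, window=48)  # keeps learning from the stream
    for x, y in train:
        batch.learn_one(x, y)
        online.learn_one(x, y)
    y_true, batch_pred, online_pred = [], [], []
    for x, y in test:
        y_true.append(y)
        batch_pred.append(batch.predict_one(x))
        online_pred.append(online.predict_one(x))
        online.learn_one(x, y)  # test-then-train: update only after predicting
    return {
        name: {"r2": r2(y_true, pred), "mape": mape(y_true, pred), "rmse": rmse(y_true, pred)}
        for name, pred in (("batch", batch_pred), ("online", online_pred))
    }


if __name__ == "__main__":
    for name, scores in evaluate().items():
        print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this toy setting the frozen model keeps predicting the pre-shift flow level, whereas the windowed model forgets old samples and recovers after about two days of new data. The numbers it prints say nothing about the results reported for Plant A or Plant B.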
Abstract: Multi-label learning is an effective framework for learning with objects that have multiple semantic labels, and it has been successfully applied to many real-world tasks. In contrast with traditional single-label learning, the cost of labeling a multi-label example is rather high, so it becomes important to train an effective multi-label learning model with as few labeled examples as possible. Active learning, which actively selects the most valuable data to query for labels, is the most important approach to reducing labeling cost. In this paper, we propose a novel approach, MADM, for batch mode multi-label active learning. On the one hand, MADM exploits representativeness and diversity in both the feature and label spaces by matching the distributions of labeled and unlabeled data. On the other hand, it tends to query predicted positive instances, which are expected to be more informative than negative ones. Experiments on benchmark datasets demonstrate that the proposed approach can reduce the labeling cost significantly.
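To make the idea of batch mode selection concrete, the following is a minimal sketch and is not MADM itself. It swaps the distribution-matching objective described in the abstract for a much simpler greedy heuristic: each candidate is scored by its distance to the nearest labeled or already selected point (a stand-in for diversity) plus the number of labels predicted positive. The function name `select_batch`, the `trade_off` weight, and the 0.5 threshold are assumptions made for illustration.

```python
import math


def select_batch(unlabeled, labeled, pred_probs, batch_size, trade_off=1.0):
    """Greedily pick indices of unlabeled instances to query (illustrative only).

    unlabeled  : list of feature vectors (tuples of floats)
    labeled    : list of feature vectors that already have labels
    pred_probs : for each unlabeled instance, predicted positive probability per label
    """
    chosen = []
    covered = list(labeled)  # points the batch should stay away from
    candidates = set(range(len(unlabeled)))

    def score(i):
        # Diversity proxy: distance to the closest labeled or chosen point.
        diversity = min((math.dist(unlabeled[i], c) for c in covered), default=0.0)
        # Preference for instances with many predicted positive labels.
        positives = sum(p > 0.5 for p in pred_probs[i])
        return diversity + trade_off * positives

    while candidates and len(chosen) < batch_size:
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        covered.append(unlabeled[best])  # later picks must differ from this one
        candidates.remove(best)
    return chosen
```

Because every pick is added to `covered`, near-duplicates of an already selected instance score low in later rounds, which is the batch-level effect that separates batch mode selection from ranking instances one at a time.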