Funding: supported by the National Key R&D Program of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Nos. 62172324, 62072324, 61876217, 6187612), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), the Primary Research and Development Plan of Jiangsu Province (No. BE2020026), and the Natural Science Foundation of Jiangsu Province (No. BK20190942).
Abstract: Most studies have conducted experiments on predicting energy consumption by integrating data for model training. However, the process of centralizing data can cause data leakage. Meanwhile, many laws and regulations on data security and privacy have been enacted, making it difficult to centralize data, which leads to a data-silo problem. Thus, to train the model while preserving user privacy, we adopt a federated learning framework. However, in the secure aggregation of all classical federated learning frameworks, the Federated Averaging (FedAvg) method directly averages the model parameters, which may have an adverse effect on the model. Therefore, we propose the Federated Reinforcement Learning (FedRL) model, in which multiple users collaboratively train the model. Each household trains a local model on its local data. These local data never leave the household, and only the encrypted parameters are uploaded to the central server to participate in the secure aggregation of the global model. We improve FedAvg by incorporating a Q-learning algorithm that assigns a weight to each uploaded local model, which improves the model's predictive performance. We validate the FedRL model on a real-world dataset and compare the experimental results with those of other models. Our proposed method improves on most evaluation metrics compared with both centralized and distributed models.
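The abstract describes replacing FedAvg's uniform average with weights assigned by a Q-learning algorithm. A minimal server-side sketch of that idea is below; the class name, the softmax weighting, and the reward definition (e.g. the change in validation loss after aggregation) are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

def softmax(q):
    e = np.exp(q - np.max(q))
    return e / e.sum()

class QWeightedAggregator:
    """Server-side aggregator: one Q-value per client replaces FedAvg's
    uniform average. All names here are illustrative, not the paper's API."""
    def __init__(self, n_clients, lr=0.1, gamma=0.9):
        self.q = np.zeros(n_clients)   # one Q-value per client
        self.lr, self.gamma = lr, gamma

    def aggregate(self, client_params):
        w = softmax(self.q)            # normalized aggregation weights
        return sum(wi * p for wi, p in zip(w, client_params))

    def update(self, rewards):
        # rewards: per-client signal, e.g. validation-loss reduction
        # attributed to each client after the last aggregation round
        best = self.q.max()
        self.q += self.lr * (np.asarray(rewards) + self.gamma * best - self.q)
```

With equal Q-values this reduces exactly to FedAvg's plain average; as rewards accumulate, clients whose updates help the global model receive larger aggregation weights.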
Funding: supported by the Primary Research and Development Plan of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Nos. 62072324, 61876217, 61876121, 61772357), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), the Primary Research and Development Plan of Jiangsu Province (No. BE2020026), and the Natural Science Foundation of Jiangsu Province (No. BK20190942).
Abstract: The optimization of multi-zone residential heating, ventilation, and air conditioning (HVAC) control is not an easy task due to its complex dynamic thermal model and the uncertainty of occupant-driven cooling loads. Deep reinforcement learning (DRL) methods have recently been proposed to address the HVAC control problem. However, applying single-agent DRL to multi-zone residential HVAC control may lead to non-convergence or slow convergence. In this paper, we propose MAQMC (Multi-Agent deep Q-network for multi-zone residential HVAC Control) to address this challenge, with the goal of minimizing energy consumption while maintaining occupants' thermal comfort. MAQMC comes in two variants: MAQMC2 (two agents: one controls the temperature of each zone, and the other controls the humidity of each zone) and MAQMC3 (three agents: each controls the temperature and humidity of one of the three zones). The experimental results show that MAQMC3 can reduce energy consumption by 6.27% and MAQMC2 by 3.73% compared with the fixed-setpoint baseline; compared with the rule-based baseline, MAQMC3 and MAQMC2 reduce comfort violations by 61.89% and 59.07%, respectively. In addition, experiments with weather data from different regions demonstrate that the well-trained MAQMC RL agents are robust and adaptable to unknown environments.
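The MAQMC2 split, one agent per controlled quantity, can be sketched with tabular Q-learning standing in for the paper's deep Q-networks; the class name, the action sets (discrete setpoints), and the shared-state, shared-reward assumption are illustrative, not taken from the paper:

```python
import random
from collections import defaultdict

class SetpointAgent:
    """One tabular Q-learning agent (a stand-in for a deep Q-network)
    controlling a single actuator, e.g. temperature or humidity."""
    def __init__(self, actions, alpha=0.5, gamma=0.9, eps=0.1):
        self.q = defaultdict(float)    # (state, action) -> Q-value
        self.actions = actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state, explore=True):
        if explore and random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, s, a, r, s2):
        target = r + self.gamma * max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

# MAQMC2-style split: one agent per controlled quantity, both observing
# the same zone state and receiving a shared comfort/energy reward.
temp_agent = SetpointAgent(actions=[22, 24, 26])   # temperature setpoints, deg C
hum_agent = SetpointAgent(actions=[40, 50, 60])    # humidity setpoints, %
```

Each agent's action space stays small (here 3 actions instead of a 3x3 joint space), which is the usual motivation for splitting a multi-zone control problem across agents rather than training one agent on the joint action space.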
Funding: supported by the National Key R&D Program of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Nos. 62072324, 61876217, 61876121, 61772357), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), and the Primary Research and Development Plan of Jiangsu Province (No. BE2020026).
Abstract: Aiming at optimizing the energy consumption of HVAC, an energy conservation optimization method is proposed for HVAC systems based on sensitivity analysis (SA), named the sensitivity analysis combination (SAC) method. Based on the SA and a neural network, the settings related to the energy conservation of HVAC systems, such as the cooling water temperature, chilled water temperature, and supply air temperature, are optimized. Moreover, based on data from an existing HVAC system, various optimal control methods for HVAC systems are tested and evaluated on a simulated HVAC system in TRNSYS. The results show that the proposed SAC method significantly reduces the computational load while maintaining energy performance equivalent to that of traditional optimal control methods.
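The core of an SA-driven approach is ranking the settings by how strongly they influence energy use, so the optimizer can concentrate on the influential ones. A minimal one-at-a-time (OAT) sensitivity sketch is below; the OAT scheme and the toy surrogate model are our assumptions standing in for the paper's SA procedure and trained neural network:

```python
def one_at_a_time_sensitivity(energy_fn, base, delta=1.0):
    """Rank each setting by how much a small perturbation changes the
    predicted energy use (simple one-at-a-time sensitivity, assumed here
    in place of the paper's exact SA procedure)."""
    sens = {}
    for k, v in base.items():
        up = dict(base, **{k: v + delta})
        dn = dict(base, **{k: v - delta})
        # central finite difference of the energy surrogate w.r.t. setting k
        sens[k] = abs(energy_fn(up) - energy_fn(dn)) / (2 * delta)
    return sorted(sens, key=sens.get, reverse=True)

# Toy linear surrogate (stand-in for a trained model) in which the
# chilled-water temperature dominates energy use.
def toy_energy(s):
    return 3.0 * s["chilled_water_C"] + 0.5 * s["cooling_water_C"] + 0.1 * s["supply_air_C"]

ranking = one_at_a_time_sensitivity(
    toy_energy,
    {"chilled_water_C": 7, "cooling_water_C": 30, "supply_air_C": 16})
```

Optimizing only the top-ranked settings is what shrinks the search space, and hence the computational load, relative to optimizing every setting jointly.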
基金financially supported by the National Key R&D Program of China(2020YFC2006602)the National Natural Science Foundation of China(Grant Nos.62072324,61876217,61876121,61772357)+3 种基金University Natural Science Foundation of Jiangsu Province(No.21KJA520005)Primary Research and Development Plan of Jiangsu Province(BE2020026)Natural ScienceFoundationof Jiangsu Province(BK20190942)Postgraduate Research&Practice Innovation Program of Jiangsu Province(No.KYCX21_3020).
Abstract: Meta-learning has been widely applied to few-shot reinforcement learning problems, where we hope to obtain an agent that can learn quickly in a new task. However, these algorithms often ignore some isolated tasks in pursuit of average performance, which may result in negative adaptation on those isolated tasks, and they usually need sufficient learning in a stationary task distribution. In this paper, we present a hierarchical framework of double meta-learning comprising classification, meta-learning, and re-adaptation. First, in the classification process, we group tasks into several subsets, treated as categories of tasks, according to the learned parameters of each task, which allows isolated tasks to be separated out. Second, in the meta-learning process, we learn a category parameter for each subset via meta-learning. Simultaneously, based on the gradient of each category parameter, we use meta-learning again to learn a meta-parameter for the whole task set, which serves as an initial parameter for a new task. Finally, in the re-adaptation process, we adapt the parameter of the new task in two steps: first by the meta-parameter and then by the appropriate category parameter. Experimentally, we demonstrate that our algorithm prevents negative adaptation without losing average performance on the whole task set, and that it adapts more rapidly during re-adaptation. Moreover, our algorithm performs well with fewer samples when the agent is exposed to an online meta-learning setting.
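The classify-then-adapt pipeline above can be sketched in a few lines: cluster tasks by their learned parameter vectors, take a category parameter per cluster and a meta-parameter over all categories, then re-adapt a new task in two steps. The k-means classifier, the averaging, and the blend in `adapt` are our illustrative assumptions, not the paper's actual procedure:

```python
import numpy as np

def cluster_tasks(task_params, k, iters=20):
    """Group per-task parameter vectors with a plain k-means, standing in
    for the paper's classification step (naive init: first k points)."""
    X = np.asarray(task_params, dtype=float)
    centers = X[:k].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == c].mean(0) if (labels == c).any()
                            else centers[c] for c in range(k)])
    return labels, centers

# Hierarchy: category parameters are the cluster centers; the
# meta-parameter averages them, so an isolated category still pulls
# the shared initialization its way instead of being ignored.
task_params = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0], [5.1, 5.0]]
labels, categories = cluster_tasks(task_params, k=2)
meta_param = categories.mean(axis=0)

def adapt(grad_fn, meta, category, lr=0.1):
    """Two-step re-adaptation: start from the meta-parameter, then pull
    toward the matching category parameter before fine-tuning again."""
    theta = meta.copy()
    theta -= lr * grad_fn(theta)          # step 1: from the meta-parameter
    theta = 0.5 * (theta + category)      # blend in the category prior
    theta -= lr * grad_fn(theta)          # step 2: from the category side
    return theta
```

On a new task close to one category, the blend moves the initialization much nearer the task optimum than a single shared meta-parameter would, which is the intuition behind the faster re-adaptation the abstract reports.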