Heating Strategy Optimization Method Based on Deep Learning (cited by: 2)
Abstract: In northern China, central heating for buildings in winter is typically controlled by a climate compensator. This strategy relies heavily on manual experience and regulates heat only coarsely, so optimizing the heating control strategy is important for keeping indoor temperature stable and comfortable. This paper proposes a heating strategy optimization method based on deep learning and deep reinforcement learning that improves the original control strategy by learning from real historical data. First, with the goal of learning the thermodynamic behavior of indoor temperature change, a deep Multiple Time Difference Network (MTDN) is proposed to predict the room temperature at the next time step; the network is both accurate and consistent with physical laws. Second, MTDN serves as a simulator, and the Soft Actor-Critic (SAC) algorithm, which is based on maximum entropy reinforcement learning, serves as the strategy optimizer: it is trained through interaction with the simulator, using an evaluation index of human thermal response as the reward, to learn a stable, well-performing heating control strategy. Finally, experiments on real data from a heat exchange station in Tianjin evaluate the predictive ability of the simulator and the control ability of the strategy optimizer separately. The results show that, compared with other types of prediction simulators, the proposed simulator achieves high prediction accuracy while conforming to physical laws, and that, compared with the original strategy, the learned strategy keeps indoor temperature more stable and comfortable across multiple randomly sampled time periods.
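As a rough illustration of the simulator-in-the-loop setup the abstract describes, the Python sketch below rolls a control policy out against a next-step room temperature predictor and scores it with a comfort reward. Everything named here is hypothetical and not taken from the paper: `predict_next_room_temp` is a first-order thermal model standing in for MTDN, `comfort_reward` is a simple distance-to-target penalty standing in for the human thermal response index, `climate_compensator` is an assumed linear outdoor-temperature rule, and all coefficients are made up. No SAC training is shown; a learned policy would simply be passed to `rollout` in place of the baseline.

```python
import random


def predict_next_room_temp(room_temp, outdoor_temp, supply_temp,
                           loss_coef=0.05, gain_coef=0.05):
    """Hypothetical stand-in for the learned simulator (not MTDN itself).

    A first-order model: the room loses heat to the outdoors and gains
    heat from the supply water, so a hotter supply never cools the room.
    """
    heat_loss = loss_coef * (room_temp - outdoor_temp)
    heat_gain = gain_coef * (supply_temp - room_temp)
    return room_temp - heat_loss + heat_gain


def comfort_reward(room_temp, target_temp=21.0):
    """Assumed comfort proxy: 0 at the target, more negative further away."""
    return -abs(room_temp - target_temp)


def climate_compensator(outdoor_temp, base=40.0, slope=1.2):
    """Assumed baseline rule: raise supply temperature as it gets colder."""
    return base - slope * outdoor_temp


def rollout(policy, outdoor_temps, initial_room_temp=20.0):
    """Run a policy against the simulator.

    `policy` maps (room_temp, outdoor_temp) to a supply temperature.
    Returns the summed reward and the room temperature trace.
    """
    room_temp = initial_room_temp
    total_reward = 0.0
    trace = []
    for outdoor_temp in outdoor_temps:
        supply_temp = policy(room_temp, outdoor_temp)
        room_temp = predict_next_room_temp(room_temp, outdoor_temp, supply_temp)
        total_reward += comfort_reward(room_temp)
        trace.append(room_temp)
    return total_reward, trace


# Evaluate the baseline rule on a synthetic outdoor temperature series.
random.seed(0)
outdoor_series = [random.uniform(-10.0, 5.0) for _ in range(48)]
baseline_reward, baseline_trace = rollout(
    lambda room, out: climate_compensator(out), outdoor_series
)
print(f"baseline total reward: {baseline_reward:.2f}")
```

Keeping the policy as a plain callable means the baseline rule and any learned controller can be compared on the same sampled periods, which mirrors the evaluation described in the abstract.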
Authors: LI Peng; YI Xiu-wen; QI De-kang; DUAN Zhe-wen; LI Tian-rui (School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China; JD Intelligent Cities Research, Beijing 100176, China; School of Computer Science and Technology, Xidian University, Xi'an 710071, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 4, pp. 263-268 (6 pages)
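Funding: National Key Research and Development Program of China (2019YFB2101801); National Natural Science Foundation of China, General Program (61773324).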
Keywords: Central heating; Heating optimization; Deep learning; Deep reinforcement learning; Urban computing