Funding: Supported by an Engineering and Physical Sciences Research Council research studentship (grant number: EP/R512400/1).
Abstract: Previous research has combined model-free reinforcement learning with model-based tree search methods to solve the unit commitment problem with stochastic demand and renewables generation. This approach was limited to shallow search depths and suffered from significant variability in run time across problem instances of varying complexity. To mitigate these issues, we extend this methodology to more advanced search algorithms based on A* search. First, we develop a problem-specific heuristic based on priority list unit commitment methods and apply it in Guided A* search, reducing run time by up to 94% with negligible impact on operating costs. Second, we address the run time variability by employing a novel anytime algorithm, Guided IDA*, which replaces the fixed search depth parameter with a time budget constraint. We show that Guided IDA* mitigates the run time variability of previous guided tree search algorithms and enables further operating cost reductions of up to 1%.
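The abstract refers to priority list unit commitment methods as the basis of its heuristic but does not give details. As a rough illustration of the classical priority list idea only (not the authors' heuristic), the sketch below commits generators in order of increasing cost until committed capacity covers demand plus a reserve margin. The `Generator` fields, the cost figures, and the function name `priority_list_commitment` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    name: str
    capacity_mw: float   # maximum output of the unit
    cost_per_mwh: float  # hypothetical average cost at full load


def priority_list_commitment(generators, demand_mw, reserve_mw=0.0):
    """Commit the cheapest units first until capacity covers demand plus reserve."""
    required = demand_mw + reserve_mw
    committed, capacity = [], 0.0
    # The "priority list" is simply the fleet sorted by cost, cheapest first.
    for gen in sorted(generators, key=lambda g: g.cost_per_mwh):
        if capacity >= required:
            break
        committed.append(gen.name)
        capacity += gen.capacity_mw
    if capacity < required:
        raise ValueError("total capacity cannot cover demand plus reserve")
    return committed


# Illustrative fleet; the numbers are invented for this sketch.
fleet = [
    Generator("coal", 400.0, 25.0),
    Generator("gas_ccgt", 300.0, 40.0),
    Generator("gas_peaker", 100.0, 90.0),
]
schedule = priority_list_commitment(fleet, demand_mw=550.0, reserve_mw=50.0)
```

A rule of this kind is cheap to evaluate, which is presumably what makes it attractive as a cost estimate inside a search algorithm; how the paper turns it into a heuristic for Guided A* is not stated in the abstract.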