This paper describes a real-time beam tuning method with an improved asynchronous advantage actor–critic (A3C) algorithm for accelerator systems. The operating parameters of devices are usually inconsistent with the predictions of physical designs because of errors in mechanical matching and installation. Therefore, parameter optimization methods such as pointwise scanning, evolutionary algorithms (EAs), and robust conjugate direction search are widely used in beam tuning to compensate for this inconsistency. However, these methods have difficulty dealing with a large number of discrete local optima. The A3C algorithm, which has been applied in the automated control field, provides an approach for improving multi-dimensional optimization. The A3C algorithm is introduced and improved for the real-time beam tuning code for accelerators. Experiments using pointwise scanning, the genetic algorithm (one kind of EA), and the A3C algorithm are conducted and compared to optimize the currents of four steering magnets and two solenoids in the low-energy beam transport (LEBT) section of the Xi’an Proton Application Facility. The optimal currents are those at which the highest transmission of a radio frequency quadrupole (RFQ) accelerator downstream of the LEBT is achieved. The optimal work points of the tuned accelerator were obtained with currents of 0 A, 0 A, 0 A, and 0.1 A for the four steering magnets, and 107 A and 96 A for the two solenoids. The highest transmission of the RFQ was 91.2%. The shorter optimization time of the A3C algorithm was also verified: optimization with the A3C algorithm consumed 42% and 78% less time than pointwise scanning with random initialization and pre-trained initialization of weights, respectively.
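As a rough illustration of the pointwise-scanning baseline mentioned above, the following sketch scans one setting at a time over a grid and keeps the best value found. The function names, the two-parameter toy objective, and the grid are all hypothetical stand-ins; they are not the facility's tuning code or its measured transmission.

```python
def pointwise_scan(measure, start, grids, sweeps=2):
    """Scan one setting at a time over its grid, keeping the best point found."""
    point = list(start)
    best = measure(point)
    for _ in range(sweeps):
        for i, grid in enumerate(grids):
            for value in grid:
                # Change only the i-th setting and re-measure.
                trial = point[:i] + [value] + point[i + 1:]
                score = measure(trial)
                if score > best:
                    point, best = trial, score
    return point, best


def toy_transmission(currents):
    # Hypothetical smooth stand-in for a measured transmission,
    # with its peak value 1.0 at (1.0, -0.5). Not real accelerator data.
    x, y = currents
    return 1.0 / (1.0 + (x - 1.0) ** 2 + (y + 0.5) ** 2)


# Two settings, each scanned from -2.0 to 2.0 in steps of 0.5.
grids = [[k * 0.5 for k in range(-4, 5)]] * 2
settings, value = pointwise_scan(toy_transmission, [0.0, 0.0], grids)
print(settings, value)
```

On a smooth single-peak objective like this toy one, the scan settles after one sweep. With many local optima, as the abstract notes, a scan of this kind can stall at whichever peak it reaches first.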
A new static task scheduling algorithm named edge-zeroing based on dynamic critical paths is proposed. The algorithm proceeds as follows: first, assume that every task is in its own cluster; second, select one of the critical paths of the partially clustered directed acyclic graph; third, try to zero one of the communication edges of the graph; fourth, repeat the preceding steps until all edges are zeroed; finally, check whether any of the generated clusters can be merged further without increasing the parallel time. Comparisons with previous algorithms show that the new algorithm not only has low complexity but also achieves performance that is, on average, comparable to or better than that of much higher-complexity heuristic algorithms.
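The steps listed in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: tasks in a cluster run sequentially in a fixed topological order, the edge tried at each step is the most expensive unexamined edge on the current critical path, a merge is kept only if the parallel time does not increase, and the final cluster-merging pass is omitted. All names (`schedule`, `edge_zeroing`, the example graph) are hypothetical.

```python
def schedule(order, weight, comm, cluster):
    """Return (parallel time, critical path) for a given clustering.

    Tasks sharing a cluster run one after another in `order`;
    an edge costs nothing when both endpoints share a cluster.
    """
    finish, blocker, last_in = {}, {}, {}
    for t in order:
        start, blocker[t] = 0, None
        prev = last_in.get(cluster[t])
        if prev is not None:                      # wait for the cluster to be free
            start, blocker[t] = finish[prev], prev
        for (u, v), c in comm.items():
            if v == t:                            # wait for data from predecessors
                arrival = finish[u] + (0 if cluster[u] == cluster[t] else c)
                if arrival > start:
                    start, blocker[t] = arrival, u
        finish[t] = start + weight[t]
        last_in[cluster[t]] = t
    end = max(finish, key=finish.get)
    path = [end]
    while blocker[path[-1]] is not None:          # walk back along the delaying tasks
        path.append(blocker[path[-1]])
    return finish[end], path[::-1]


def edge_zeroing(order, weight, comm):
    cluster = {t: t for t in order}               # step 1: one cluster per task
    examined = set()
    best, path = schedule(order, weight, comm, cluster)
    while len(examined) < len(comm):
        # Step 2: prefer unexamined edges on the current critical path.
        on_path = [e for e in zip(path, path[1:]) if e in comm and e not in examined]
        candidates = on_path or [e for e in comm if e not in examined]
        u, v = max(candidates, key=comm.get)
        examined.add((u, v))
        # Step 3: tentatively zero the edge by merging the two clusters.
        trial = {t: (cluster[u] if c == cluster[v] else c) for t, c in cluster.items()}
        t_trial, p_trial = schedule(order, weight, comm, trial)
        if t_trial <= best:                       # keep the merge only if not worse
            cluster, best, path = trial, t_trial, p_trial
    return cluster, best


# Hypothetical fork-join graph: a -> {b, c} -> d, listed in topological order.
order = ["a", "b", "c", "d"]
weight = {"a": 1, "b": 2, "c": 2, "d": 1}
comm = {("a", "b"): 5, ("a", "c"): 1, ("b", "d"): 5, ("c", "d"): 1}
clusters, makespan = edge_zeroing(order, weight, comm)
print(clusters, makespan)
```

In this example, the unclustered graph has a parallel time of 14, dominated by the two expensive edges on the path a, b, d. Zeroing those edges first brings the parallel time down to 6.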
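To address the problem that Q-function overestimation in the fixed-temperature SAC (Soft Actor Critic) algorithm may cause the algorithm to fall into local optima, a stable and constrained SAC algorithm (SCSAC: Stable Constrained Soft Actor Critic) is proposed on the basis of an in-depth analysis. The algorithm modifies the maximum-entropy objective function to repair the Q-function overestimation in fixed-temperature SAC, and it also improves the stability of the algorithm during testing. Finally, SCSAC is validated in four OpenAI Gym Mujoco environments. The experimental results show that, compared with fixed-temperature SAC, the stable and constrained SAC algorithm effectively reduces the number of occurrences of Q-function overestimation and obtains more stable results in testing.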
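For context, the standard maximum-entropy objective of SAC with a fixed temperature, together with the soft value recursion in which Q-function overestimation can accumulate, is written below. The notation follows the usual SAC formulation and is an assumption here, since the abstract gives no formulas. The modified objective used by SCSAC is not stated in the abstract and is therefore not reproduced.

```latex
% Maximum-entropy objective with fixed temperature \alpha
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

% Soft Bellman recursion for the Q-function
Q(s_t, a_t) = r(s_t, a_t) + \gamma \, \mathbb{E}_{s_{t+1}} \left[ V(s_{t+1}) \right]

% Soft state value
V(s_t) = \mathbb{E}_{a_t \sim \pi} \left[ Q(s_t, a_t) - \alpha \log \pi(a_t \mid s_t) \right]
```

Because each Q target bootstraps from the next-state estimate, an overestimate in Q propagates through V into later targets. This is the effect that the abstract's modified objective is meant to reduce.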