When a line failure occurs in a power grid, a load transfer is implemented to reconfigure the network by changing the states of tie-switches and load demands. Computation speed is one of the major performance indicators in power grid load transfer, as a fast load transfer model can greatly reduce the economic loss of post-fault power grids. In this study, a reinforcement learning method is developed based on a deep deterministic policy gradient. The tedious training process of the reinforcement learning model can be conducted offline, so the model shows satisfactory performance in real-time operation, indicating that it is suitable for fast load transfer. Considering that the reinforcement learning model performs poorly in satisfying safety constraints, a safe action-correction framework is proposed to modify the learning model. In the framework, the action of load shedding is corrected according to sensitivity analysis results under a small discrete increment so as to match the constraints of line flow limits. The results of case studies indicate that the proposed method is practical for fast and safe power grid load transfer.
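The action-correction idea can be sketched as a simple loop: while any line exceeds its flow limit, shed one more small discrete increment of load at the bus whose sensitivity relieves the worst-loaded line the most. This is an illustrative sketch under assumed inputs (a linearised flow model, a fixed step size, and a greedy bus-selection rule), not the paper's exact procedure.

```python
def correct_load_shedding(flows, limits, sens, shed, step=0.05, max_iter=1000):
    """Correct an RL load-shedding action in small discrete increments.

    flows[i]   current flow on line i (per unit)
    limits[i]  flow limit of line i
    sens[i][b] sensitivity d(flow_i)/d(shed_b) from sensitivity analysis
    shed[b]    shedding requested at bus b by the learned policy
    """
    flows, shed = list(flows), list(shed)
    for _ in range(max_iter):
        worst = max(range(len(flows)), key=lambda i: flows[i] - limits[i])
        if flows[worst] - limits[worst] <= 1e-9:      # all line limits satisfied
            break
        # shed one more increment at the bus that relieves the worst line fastest
        bus = min(range(len(shed)), key=lambda b: sens[worst][b])
        shed[bus] += step
        for i in range(len(flows)):                   # linearised flow update
            flows[i] += sens[i][bus] * step
    return shed, flows
```

For example, with one line at 1.1 p.u. against a 1.0 p.u. limit and a sensitivity of -0.5 at the best bus, four 0.05 p.u. increments bring the flow back within its limit.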
This paper studies price-based residential demand response management (PB-RDRM) in smart grids, in which non-dispatchable and dispatchable loads (including general loads and plug-in electric vehicles (PEVs)) are both involved. The PB-RDRM is composed of a bi-level optimization problem, in which the upper-level dynamic retail pricing problem aims to maximize the profit of a utility company (UC) by selecting optimal retail prices (RPs), while the lower-level demand response (DR) problem expects to minimize the comprehensive cost of loads by coordinating their energy consumption behavior. The challenges here are mainly two-fold: 1) the uncertainty of energy consumption and RPs; 2) the flexible PEVs' temporally coupled constraints, which make it impossible to directly develop a model-based optimization algorithm to solve the PB-RDRM. To address these challenges, we first model the dynamic retail pricing problem as a Markovian decision process (MDP), and then employ a model-free reinforcement learning (RL) algorithm to learn the optimal dynamic RPs of the UC according to the loads' responses. Our proposed RL-based DR algorithm is benchmarked against two model-based optimization approaches (i.e., the distributed dual decomposition-based (DDB) method and the distributed primal-dual interior (PDI)-based method), which require exact load and electricity price models. The comparison results show that, compared with the benchmark solutions, our proposed algorithm can not only adaptively decide the RPs through on-line learning processes, but also achieve larger social welfare within an unknown electricity market environment.
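To make the upper-level pricing MDP concrete, here is a minimal tabular sketch: the utility company repeatedly picks a retail price, observes an assumed linear load response, and updates action values from the realised profit. The price set, demand curve, and wholesale cost are all invented for illustration; the paper's formulation is bi-level with a richer state.

```python
import random

PRICES = [0.10, 0.15, 0.20, 0.25]   # candidate retail prices ($/kWh), assumed
WHOLESALE = 0.12                     # utility's purchase cost ($/kWh), assumed

def demand(price):                   # hypothetical linear demand response (kWh)
    return max(0.0, 100.0 - 300.0 * price)

def uc_profit(price):                # upper-level reward: (RP - cost) * load served
    return (price - WHOLESALE) * demand(price)

def learn_price(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Epsilon-greedy value learning over the discrete price actions."""
    rng = random.Random(seed)
    q = [0.0] * len(PRICES)
    for _ in range(episodes):
        greedy = max(range(len(PRICES)), key=q.__getitem__)
        a = rng.randrange(len(PRICES)) if rng.random() < eps else greedy
        q[a] += alpha * (uc_profit(PRICES[a]) - q[a])   # incremental update
    return PRICES[max(range(len(PRICES)), key=q.__getitem__)]
```

Under these toy numbers the learner settles on the profit-maximising price of 0.25 $/kWh without ever being given the demand model in closed form, which is the point of the model-free approach.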
Uplift response of symmetrical anchor plates with and without grid fixed reinforced (GFR) reinforcement was evaluated in model tests and numerical simulations by Plaxis. Many variations of reinforcement layers were used to reinforce the sandy soil over symmetrical anchor plates. In the current research, different factors such as relative density of sand, embedment ratios, and various GFR parameters including size, number of layers, and the proximity of the layer to the symmetrical anchor plate were investigated in a scale model. The failure mechanism and the associated rupture surface were observed and evaluated. GFR, a tied-up system made of fiber reinforcement polymer (FRP) strips and end balls, was connected to the geosynthetic material and anchored into the soil. Test results showed that using GFR reinforcement significantly improved the uplift capacity of anchor plates. The inclusion of one layer of GFR, resting directly on top of the anchor plate, was more effective in enhancing the anchor capacity than the other configurations; including GFR improved the uplift response by 29%. Multiple layers of GFR proved more effective in enhancing the uplift capacity than a single GFR reinforcement, due to the additional anchorage provided by the GFR at each level of reinforcement. In general, the results show that the uplift capacity of symmetrical anchor plates in loose and dense sand can be significantly increased by the inclusion of GFR. It was also observed that the inclusion of GFR reduced the requirement for a large L/D ratio to achieve the required uplift capacity. The laboratory and numerical analysis results are found to be in agreement in terms of breakout factor and failure mechanism pattern.
A novel microgrid control strategy is presented in this paper. A resilient community microgrid model, which is equipped with solar PV generation and electric vehicles (EVs) and an improved inverter control system, is considered. To fully exploit the capability of the community microgrid to operate in either grid-connected mode or islanded mode, as well as to achieve improved stability of the microgrid system, universal droop control, virtual inertia control, and a reinforcement learning-based control mechanism are combined in a cohesive manner, in which adaptive control parameters are determined online to tune the influence of the controllers. The microgrid model and control mechanisms are implemented in MATLAB/Simulink and set up in real-time simulation to test the feasibility and effectiveness of the proposed model. Experiment results reveal the effectiveness of regulating the controller's frequency and voltage for various operating conditions and scenarios of a microgrid.
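The droop part of the control mix can be written down compactly. The sketch below is a conventional P-f / Q-V droop characteristic with illustrative gains and nominal values; the paper's universal droop control and the RL-tuned adaptive parameters are more involved.

```python
def droop_setpoints(p, q, f_nom=50.0, v_nom=1.0,
                    p_ref=0.0, q_ref=0.0, kp=0.05, kq=0.04):
    """P-f / Q-V droop: frequency sags as active-power output rises above
    its reference; voltage sags as reactive-power output rises above its
    reference. Gains kp, kq are placeholder values, not tuned constants."""
    f = f_nom - kp * (p - p_ref)   # Hz
    v = v_nom - kq * (q - q_ref)   # per unit
    return f, v
```

For example, an inverter exporting 1.0 p.u. active and 0.5 p.u. reactive power under these gains would settle near 49.95 Hz and 0.98 p.u.; an adaptive scheme such as the paper's would adjust kp and kq online instead of keeping them fixed.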
Real-time fault detection is important for the operation of a smart grid. Designing an anomaly detection system based on deep learning, using the powerful computing capacity of the cloud, has become a trend of future development. However, Internet transmission delay is large, which may push the combined detection and transmission time beyond acceptable limits, while a purely edge-based scheme may not be able to undertake all data detection tasks due to the limited computing resources of edge devices. Therefore, we propose a cloud-edge collaborative smart grid fault detection system, in which edge devices are placed close to the grid and equipped with lightweight neural networks of different precision for fault detection. In addition, a sub-optimal, real-time communication and computing resource allocation method is proposed based on deep reinforcement learning. This method greatly speeds up solution time, meets the requirements on data transmission delay, maximizes system throughput, and improves communication efficiency. Simulation results show the scheme is superior in transmission delay and improves the real-time performance of the smart grid detection system.
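The delay trade-off the DRL allocator optimises can be seen in a much simpler greedy baseline: route each detection task to whichever side (fast but remote cloud, slower but local edge) would finish it sooner, assuming one sequential queue per side. The rates and network delay below are made-up numbers; the paper learns this allocation rather than hard-coding it.

```python
def assign_tasks(tasks, edge_rate, cloud_rate, net_delay):
    """Greedy cloud/edge split over task sizes (arbitrary units).

    edge_rate/cloud_rate: processing speed of each side;
    net_delay: fixed Internet transmission delay paid per cloud task.
    """
    edge_busy = cloud_busy = 0.0
    plan = []
    for size in tasks:
        t_edge = edge_busy + size / edge_rate
        t_cloud = cloud_busy + net_delay + size / cloud_rate
        if t_edge <= t_cloud:          # finishes sooner locally
            edge_busy = t_edge
            plan.append("edge")
        else:                          # worth paying the transmission delay
            cloud_busy = t_cloud
            plan.append("cloud")
    return plan
```

Even this toy rule shows why a pure-edge or pure-cloud policy loses: with a fast cloud and a nontrivial network delay, the best split alternates between the two sides as their queues fill.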
Traditional reinforcement learning (RL) uses the return, also known as the expected value of cumulative random rewards, for training an agent to learn an optimal policy. However, recent research indicates that learning the distribution over returns has distinct advantages over learning their expected value, as seen in different RL tasks. The shift from using the expectation of returns in traditional RL to the distribution over returns in distributional RL has provided new insights into the dynamics of RL. This paper builds on our recent work investigating the quantum approach towards RL. Our work implements quantile regression (QR) distributional Q learning with a quantum neural network. This quantum network is evaluated in a grid world environment with a different number of quantiles, illustrating its detailed influence on the learning of the algorithm. It is also compared to the standard quantum Q learning in a Markov Decision Process (MDP) chain, which demonstrates that the quantum QR distributional Q learning can explore the environment more efficiently than the standard quantum Q learning. Efficient exploration and balancing of exploitation and exploration are major challenges in RL. Previous work has shown that more informative actions can be taken with a distributional perspective. Our findings suggest another cause for its success: the enhanced performance of distributional RL can be partially attributed to its superior ability to efficiently explore the environment.
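The core of QR distributional Q learning is the asymmetric pinball loss that pulls each predicted quantile toward its target fraction of the return distribution. A minimal classical version (plain pinball loss, no quantum network and no Huber smoothing) can be sketched as follows:

```python
def pinball_loss(td_errors, taus):
    """Quantile-regression loss over distributional TD errors.

    td_errors[i][j] is the TD error between the i-th predicted quantile
    and the j-th target sample; taus[i] is that quantile's midpoint.
    Under-estimates (u >= 0) are weighted by tau, over-estimates by 1 - tau,
    which is what drags estimate i toward the tau_i-quantile of the targets.
    """
    total, count = 0.0, 0
    for tau, row in zip(taus, td_errors):
        for u in row:
            total += (tau if u >= 0 else 1.0 - tau) * abs(u)
            count += 1
    return total / count

# Quantile midpoints for N estimates, as used in QR-style distributional RL.
N = 4
taus = [(2 * i + 1) / (2 * N) for i in range(N)]   # [0.125, 0.375, 0.625, 0.875]
```

Varying N (the number of quantiles, as the paper does) changes how finely the return distribution is resolved: N = 1 with tau = 0.5 collapses back to learning a single central value.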
The advantage of quantum computers over classical computers fuels the recent trend of developing machine learning algorithms on quantum computers, which can potentially lead to breakthroughs and new learning models in this area. The aim of our study is to explore deep quantum reinforcement learning (RL) on photonic quantum computers, which can process information stored in the quantum states of light. These quantum computers can naturally represent continuous variables, making them an ideal platform to create quantum versions of neural networks. Using quantum photonic circuits, we implement Q learning and actor-critic algorithms with multilayer quantum neural networks and test them in the grid world environment. Our experiments show that 1) these quantum algorithms can solve the RL problem and 2) compared to one layer, using three-layer quantum networks improves the learning of both algorithms in terms of rewards collected. In summary, our findings suggest that having more layers in deep quantum RL can enhance the learning outcome.
With the rapid advance of Industry 4.0, the power Supervisory Control and Data Acquisition (SCADA) systems interconnected with it are becoming increasingly informatized and intelligent. Owing to the inherent vulnerability of these systems and the asymmetry between attack and defense capabilities, various security risks exist. In recent years, attacks against power systems have occurred frequently, and attack mitigation methods targeting smart grids are urgently needed. As an efficient deception-based defense, honeypots can effectively collect attack behavior in smart grids. To address the shortcomings of existing smart grid honeypots, namely insufficient interaction depth, missing simulation of physical industrial processes, and poor extensibility, this work designs and implements SGPot, a reinforcement learning-based smart grid honeypot framework. Based on system invariants from real devices in the power industry, SGPot can emulate the control side of a smart substation and improve the honeypot's deceptiveness by simulating power business processes, enticing attackers into deep interaction with the honeypot. To evaluate the framework's performance, a small smart-substation experimental environment was built, and SGPot was deployed on the public Internet alongside the existing GridPot and SHaPe honeypots, collecting 30 days of interaction data. Experimental results show that SGPot collected 20% more request data than GridPot and 75% more than SHaPe. SGPot entices attackers into deeper interaction with the honeypot, and it obtained more sessions with an interaction length greater than 6 than either GridPot or SHaPe.
The massive integration of communication and information technology with the large-scale power grid has enhanced the efficiency, safety, and economical operation of cyber-physical systems. However, the open and diversified communication environment of the smart grid is exposed to cyber-attacks. Data integrity attacks that can bypass conventional security techniques have been considered critical threats to the operation of the grid. Current detection techniques cannot learn the dynamic and heterogeneous characteristics of the smart grid and are unable to deal with non-Euclidean data types. To address the issue, we propose a novel Deep-Q-Network scheme empowered with a graph convolutional network (GCN) framework to detect data integrity attacks in cyber-physical systems. The simulation results show that the proposed framework is scalable and achieves higher detection accuracy, unlike other benchmark techniques.
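A single graph-convolution step, the building block a GCN-based detector stacks to handle non-Euclidean grid topology, propagates each bus's features over the self-loop-augmented, symmetrically normalised adjacency. The sketch below is a generic GCN layer in plain Python with fixed example weights, not the paper's trained architecture:

```python
def matmul(a, b):
    """Dense matrix product over nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, h, w):
    """One GCN step: H' = ReLU(D^-1/2 (A + I) D^-1/2 . H . W).

    adj: n x n adjacency of the grid graph; h: n x f node features;
    w: f x f' weight matrix (learnable in practice, fixed here).
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]                       # add self-loops
    deg = [sum(row) for row in a_hat]                 # degrees of A + I
    norm = [[a_hat[i][j] / (deg[i] ** 0.5 * deg[j] ** 0.5) for j in range(n)]
            for i in range(n)]                        # symmetric normalisation
    out = matmul(matmul(norm, h), w)
    return [[max(0.0, v) for v in row] for row in out]   # ReLU
```

On a two-bus line with features [1, 0], one layer already mixes the neighbouring measurement into each node, which is how the detector picks up spatially correlated integrity attacks.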
In terms of model-free voltage control methods, when the device or topology of the system changes, the model's accuracy often decreases, so an adaptive model is needed to accommodate the changed inputs. To overcome the defects of a model-free control method, this paper proposes an automatic voltage control (AVC) method for different power grids based on transfer learning and deep reinforcement learning. First, when constructing the Markov game of AVC, both the magnitude and the number of voltage deviations are taken into account in the reward. Then, an AVC method based on constrained multi-agent deep reinforcement learning (DRL) is developed. To further improve learning efficiency, domain knowledge is used to reduce the action space. Next, distribution adaptation transfer learning is introduced for the AVC transfer circumstance of systems with the same structure but distinct topological relations/parameters, which can perform well without any further training even if the structure changes. Moreover, for the AVC transfer circumstance of various power grids, parameter-based transfer learning is created, which enhances the target system's training speed and effect. Finally, the method's efficacy is tested using two IEEE systems and two real-world power grids.
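A reward that penalises both the magnitude and the number of voltage deviations, as described for the Markov game above, can be sketched as below. The voltage band and the two weights are placeholders, since the abstract does not state the exact coefficients.

```python
def voltage_reward(voltages, v_min=0.95, v_max=1.05, w_mag=1.0, w_cnt=0.5):
    """Negative reward combining two violation terms (illustrative weights).

    mag: total per-unit excursion outside the [v_min, v_max] band;
    cnt: number of buses whose voltage lies outside the band.
    """
    mag = sum(max(0.0, v_min - v, v - v_max) for v in voltages)
    cnt = sum(1 for v in voltages if v < v_min or v > v_max)
    return -(w_mag * mag + w_cnt * cnt)
```

For example, buses at 1.00, 1.10, and 0.90 p.u. against a 0.95 to 1.05 band give two violating buses with 0.05 p.u. excursions each, hence a reward of about -1.1 under these weights; counting violations separately from their size keeps many small excursions from being traded for one large one.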
Funding (power grid load transfer study): the Incubation Project of State Grid Jiangsu Corporation of China, "Construction and Application of Intelligent Load Transferring Platform for Active Distribution Networks" (JF2023031).
Funding (PB-RDRM study): supported in part by the National Natural Science Foundation of China (61922076, 61725304, 61873252, 61991403, 61991400), and in part by the Australian Research Council Discovery Program (DP200101199).
Funding (anchor plate uplift study): supported by a research grant at UTM, Malaysia (GUP Grant), project "uplift response of symmetrical anchor plates in grid fixed reinforced in cohesionless soil".
Funding (smart grid fault detection study): supported in part by the National Natural Science Foundation of China (52077049, 52277087), the Anhui Provincial Natural Science Foundation (2108085UD07), and the 111 Project (BP0719039).
Funding (automatic voltage control study): supported by the National Science Foundation of China (U1866602).