Comparison and Adaptation of Two Strategies for Anomaly Detection in Load Profiles Based on Methods from the Fields of Machine Learning and Statistics

Abstract The Federal Office for Economic Affairs and Export Control (BAFA) of Germany promotes digital concepts for increasing energy efficiency as part of the “Pilotprogramm Einsparzähler”. Within this program, Limón GmbH, in cooperation with the University of Kassel, is developing software that identifies efficiency potentials in load profiles by means of automated anomaly detection. This study therefore evaluates two strategies for anomaly detection in load profiles. To estimate the monthly load profile, strategy 1 uses an LSTM (Long Short-Term Memory) artificial neural network with a data period of one month (1 M) or three months (3 M), and strategy 2 uses the smoothing method PEWMA (Probabilistic Exponentially Weighted Moving Average). The estimate is compared with the original load profile data, and single residuals, or residuals summed over sequence lengths of two, four, six and eight hours, are flagged as anomalies when they exceed a predefined threshold. The thresholds are defined by the Z-score test, i.e., residuals greater than 2, 2.5 or 3 standard deviations are considered anomalous. In addition, the ESD (Extreme Studentized Deviate) test is used to set thresholds at three significance levels, 0.05, 0.10 and 0.15, with a maximum of k = 40 iterations. Five load profiles are examined; they were selected by k-means clustering as a representative sample of all data sets available at Limón GmbH. For strategy 1 applied to single residuals, the evaluation shows a maximum F1-value of 0.4 (1 M) and, across all examined companies, a highest average F1-value of 0.24 with a standard deviation of 0.09 (1 M). In variant 3 M, the highest F1-value was achieved for summed residuals with a partial sequence length of four hours, with an average F1-value of 0.21 and a standard deviation of 0.06. The PEWMA-based strategy 2 did not show higher anomaly detection efficacy than strategy 1 for any of the investigated companies.
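The Z-score thresholding and F1 evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the synthetic residuals, the use of absolute deviations, the overlapping sliding windows and the 15-minute sampling interval (so that two hours correspond to 8 samples) are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def zscore_flags(values, n_sigma=3.0):
    """Flag entries whose absolute z-score exceeds n_sigma.

    Assumption: deviations are treated as two-sided (absolute value).
    """
    v = np.asarray(values, dtype=float)
    std = v.std()
    if std == 0:
        # No spread at all: nothing can exceed the threshold.
        return np.zeros(v.shape, dtype=bool)
    z = (v - v.mean()) / std
    return np.abs(z) > n_sigma


def summed_residuals(residuals, window):
    """Sum residuals over overlapping windows of `window` consecutive samples.

    Assumption: sliding (overlapping) windows; the abstract only states
    that residuals are summed over partial sequences of 2 to 8 hours.
    """
    r = np.asarray(residuals, dtype=float)
    return np.convolve(r, np.ones(window), mode="valid")


def f1_score(predicted, actual):
    """F1 = 2*TP / (2*TP + FP + FN), returning 0.0 if the denominator is 0."""
    p = np.asarray(predicted, dtype=bool)
    a = np.asarray(actual, dtype=bool)
    tp = np.sum(p & a)
    fp = np.sum(p & ~a)
    fn = np.sum(~p & a)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else float(2 * tp / denom)


# Synthetic residuals for one day at an assumed 15-minute resolution
# (96 samples): small alternating noise plus one injected spike.
residuals = np.tile([1.0, -1.0], 48)
residuals[50] = 20.0
truth = np.zeros(96, dtype=bool)
truth[50] = True

# Single-residual test with a 3-sigma threshold.
single_flags = zscore_flags(residuals, n_sigma=3.0)
print("flagged indices:", np.flatnonzero(single_flags))
print("F1 on single residuals:", f1_score(single_flags, truth))

# Summed-residual test: 2 hours = 8 samples under the assumed resolution,
# here with the 2-sigma threshold.
window_sums = summed_residuals(residuals, window=8)
window_flags = zscore_flags(window_sums, n_sigma=2.0)
print("flagged windows:", int(window_flags.sum()))
```

In this sketch, a flagged window marks a whole partial sequence rather than a single reading, so comparing window-level flags against labelled anomalies needs an additional mapping rule that the abstract does not describe.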
Authors Patrick Krawiec; Mark Junge; Jens Hesselbach (Limón GmbH, Kassel, Germany; Department for Sustainable Products and Processes (Upp), University of Kassel, Kassel, Germany)
Source Open Journal of Energy Efficiency, 2021, Issue 2, pp. 37-49 (13 pages)
Keywords Energy Efficiency; Anomaly Detection; Load Profiles; LSTM; PEWMA