Forecasting Renewable Energy Generation with Streamlit and sktime

Developing a Streamlit web app that provides forecasts for renewable energy generation in EU countries

Giannis Tolios
Towards Data Science


Photo by Nicholas Doherty on Unsplash

Climate change is undoubtedly one of the biggest challenges that humanity is facing. The past decade was the warmest on record, and 2019 ended with a global mean temperature of 1.1°C above pre-industrial levels. Rising global temperatures have a significant impact on every aspect of human well-being, such as health and food security. Numerous people have died because of heatwaves in recent years, while others are suffering from hunger due to climate variability and extreme weather events.

Scientists are constantly making alarming discoveries related to global warming. For example, it was recently reported that the Earth has lost 28 trillion tonnes of ice since 1994, indicating that the average global sea level could rise significantly by the end of the century. According to the Intergovernmental Panel on Climate Change (IPCC), the world must become carbon-neutral by 2050 to have any hope of limiting global warming to 1.5°C and avoiding the worst effects of climate change. UN Secretary-General António Guterres has stated that we are currently way off track to meet this target.

Renewable Energy

There is a scientific consensus that anthropogenic climate change is caused by carbon dioxide and other greenhouse gas (GHG) emissions from human activities. It is therefore crucial to decrease GHG concentrations in the atmosphere to mitigate the problem. Fossil fuel power stations are responsible for a large fraction of GHG emissions, making the transition to low-carbon alternatives more important than ever. The cost of renewable energy generation has dropped significantly in the past few years, making it a viable and cost-effective option, even if we ignore the environmental protection aspect. Unfortunately, a recent study has shown that only 10% of electric utility companies are prioritizing renewable energy, suggesting that the transition is progressing very slowly. On the other hand, it is also an indication that the renewable energy industry has huge potential for growth.

Chart by Our World in Data

Forecasting Renewable Energy Generation

There are numerous renewable energy sources that can be used to generate electricity, such as solar, wind, hydroelectric, biomass and geothermal. Most countries are focusing on solar and wind energy, as the alternatives have significant constraints and limitations. For example, hydroelectric and geothermal power stations can only be built in specific areas. Solar and wind energy are more flexible, but their output depends on weather conditions and other factors, so they are known as variable renewable energy (VRE)¹. Integrating VRE power stations into the electric grid is challenging due to their fluctuating output. Accurate forecasts can help us overcome this obstacle, and improve electricity scheduling and long-term system planning².

The sktime Python Library

Time series forecasting is typically done with statistical models, such as ARIMA. In recent years though, machine learning algorithms have been used as well. Time series forecasting can be transformed into a supervised learning problem easily, by using the sliding window method. In this method, each value is considered to be a label y, and the previous n (the window length) values are the x features. The resulting dataset can be used to train the regression model of our preference. You can see an example of this (with a window length of n = 10) in the following table.

[ 0  1  2  3  4  5  6  7  8  9] [10]
[ 1  2  3  4  5  6  7  8  9 10] [11]
[ 2  3  4  5  6  7  8  9 10 11] [12]
[ 3  4  5  6  7  8  9 10 11 12] [13]
[ 4  5  6  7  8  9 10 11 12 13] [14]
[ 5  6  7  8  9 10 11 12 13 14] [15]
[ 6  7  8  9 10 11 12 13 14 15] [16]
[ 7  8  9 10 11 12 13 14 15 16] [17]
[ 8  9 10 11 12 13 14 15 16 17] [18]
[ 9 10 11 12 13 14 15 16 17 18] [19]
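The table above can be reproduced with a few lines of plain Python. This is only a minimal sketch of the sliding window method; the function name sliding_window is a hypothetical helper chosen for illustration, not part of sktime or Renewcast.

```python
def sliding_window(series, window_length):
    """Split a series into feature windows (X) and next-value labels (y)."""
    X, y = [], []
    # Each window of `window_length` values is paired with the value that follows it.
    for start in range(len(series) - window_length):
        X.append(list(series[start:start + window_length]))
        y.append(series[start + window_length])
    return X, y


# The integers 0..19 with a window length of 10 give the ten rows of the table.
X, y = sliding_window(list(range(20)), window_length=10)
for features, label in zip(X, y):
    print(features, [label])
```

Each row of X holds the n previous values and the matching entry of y is the value to be predicted, so the pair (X, y) can be passed to any regression model.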

sktime is a Python library that provides useful tools for time series forecasting based on machine learning algorithms. It automatically applies the sliding window method to the time series, while also being compatible with the scikit-learn API. This means that it can easily be used with any scikit-learn regressor, or with regressors from other compatible libraries, such as the popular XGBoost. For an in-depth look at sktime, you can check out this article, or read the research paper that was written by its developers³.
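To give a feel for what this kind of reduction does under the hood, here is a rough from-scratch sketch that does not use sktime itself. It assumes a recursive strategy, where each prediction is fed back into the window to produce the next one. The MeanStepRegressor class is a toy stand-in that I made up for illustration; it only mimics the fit/predict interface of a scikit-learn regressor, and any real regressor with that interface could take its place.

```python
class MeanStepRegressor:
    """Toy regressor (hypothetical): predicts last window value + average step."""

    def fit(self, X, y):
        # Learn the average difference between each label and the last value of its window.
        steps = [target - row[-1] for row, target in zip(X, y)]
        self.step_ = sum(steps) / len(steps)
        return self

    def predict(self, X):
        return [row[-1] + self.step_ for row in X]


def make_windows(series, window_length):
    """Sliding window: previous `window_length` values -> next value."""
    n_rows = len(series) - window_length
    X = [list(series[i:i + window_length]) for i in range(n_rows)]
    y = [series[i + window_length] for i in range(n_rows)]
    return X, y


def recursive_forecast(series, regressor, window_length, horizon):
    """Train on sliding windows, then forecast `horizon` steps one at a time."""
    X, y = make_windows(series, window_length)
    regressor.fit(X, y)
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        window = history[-window_length:]
        next_value = regressor.predict([window])[0]
        forecasts.append(next_value)
        history.append(next_value)  # feed the prediction back into the window
    return forecasts


forecast = recursive_forecast(list(range(20)), MeanStepRegressor(), window_length=10, horizon=3)
print(forecast)  # [20.0, 21.0, 22.0]
```

sktime wraps these steps in a single forecaster object, so in practice you only choose the regressor, the window length and the forecast horizon.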

The Streamlit Framework

Streamlit is a Python framework that lets you build web apps for data science projects very quickly. You can easily create a user interface with various widgets in a few lines of code. Furthermore, Streamlit is a great tool for deploying machine learning models to the web and adding visualizations of your data. Streamlit also has a powerful caching mechanism that optimizes the performance of your app. A detailed introduction to Streamlit is available here.

Developing the Renewcast Web App

Renewcast is a web app that provides forecasts for renewable energy generation in European Union countries. I decided to develop this app for a number of reasons. First of all, I am very concerned about climate change, and I want to explore the various ways that machine learning can help mitigate it. Second, I am always trying to enhance my skill set, and developing this app was a great way to familiarize myself with Streamlit and sktime. I am now going to describe the functionality of the source code, starting with the app.py file.

This is the main Streamlit app. First of all, I imported the Streamlit library, as well as some functions I created myself. After that, I added a title and description for the app, using the associated Streamlit functions. I then proceeded to create a basic user interface, by adding widgets on the standard Streamlit sidebar. Users are able to select a country and regression algorithm, as well as modify the forecast horizon and the window length. Finally, I used the area_chart() and line_chart() functions to plot the total energy generation, as well as the forecasts of renewable energy generation (solar and wind). Let’s continue with the entsoe_client.py file.
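The layout described above could look roughly like the following sketch. It is not the actual app.py of Renewcast: the function name render_app, the widget labels, the slider ranges and the placeholder data are all assumptions made for illustration. The Streamlit module is passed in as the st argument, which is an unusual choice made here only so the layout can be exercised without a running Streamlit server.

```python
def render_app(st, countries, regressors):
    """Draw a Renewcast-like page using the Streamlit module passed in as `st`."""
    st.title("Renewable energy forecast (demo)")
    st.write("Placeholder description for the sketch.")

    # Sidebar widgets: country, regression algorithm, forecast horizon, window length.
    country = st.sidebar.selectbox("Country", countries)
    regressor = st.sidebar.selectbox("Regression algorithm", regressors)
    horizon = st.sidebar.slider("Forecast horizon", min_value=1, max_value=48, value=24)
    window_length = st.sidebar.slider("Window length", min_value=1, max_value=72, value=24)

    # Synthetic placeholder values; the real app downloads data and runs a forecaster.
    total_generation = {"total": [50 + (step % 24) for step in range(72)]}
    forecast = {"solar and wind forecast": [10 + (step % 24) / 2 for step in range(horizon)]}

    st.area_chart(total_generation)  # total energy generation
    st.line_chart(forecast)          # forecast of renewable generation

    return {
        "country": country,
        "regressor": regressor,
        "horizon": horizon,
        "window_length": window_length,
    }


# In a script started with `streamlit run`, this would be called as:
#   import streamlit as st
#   render_app(st, ["Greece", "Germany"], ["Linear Regression", "Random Forest"])
```

Streamlit reruns the whole script whenever a widget value changes, so the charts are redrawn with the new settings without any extra callback code.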

The get_energy_data() function interfaces with the ENTSO-E API to download the necessary data about energy generation for each EU country. Before defining the function, I inserted the @st.cache decorator to use the caching mechanism of Streamlit. I did that so the ENTSO-E API will not be called every time we need to use that data, but only when it needs to be updated. This optimizes the performance of our app and significantly reduces the time needed to run it. Let’s move on now and examine the functionality of forecast.py, the last of the main source code files.
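The idea behind caching can be illustrated without Streamlit or the real API. The sketch below swaps in functools.lru_cache from the standard library as a stand-in for @st.cache, and get_energy_data_demo is a hypothetical function that returns made-up values instead of calling ENTSO-E. Note also that recent Streamlit releases replace @st.cache with st.cache_data, so the decorator name depends on the version you have installed.

```python
from functools import lru_cache

api_calls = 0  # counts how often the (fake) download actually runs


@lru_cache(maxsize=32)
def get_energy_data_demo(country_code):
    """Pretend to download generation data; results are cached per country code."""
    global api_calls
    api_calls += 1
    # Made-up placeholder values standing in for an API response.
    return tuple(100 + hour for hour in range(24))


first = get_energy_data_demo("GR")   # runs the function body
second = get_energy_data_demo("GR")  # served from the cache, no new "API call"
third = get_energy_data_demo("DE")   # different argument, so the body runs again

print(api_calls)  # 2
```

Only two of the three calls reach the function body, which is the same effect the Streamlit decorator has on the real download function.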

The select_regressor() function simply maps the regression algorithm options of the user interface to the associated scikit-learn classes. I have included some typical scikit-learn regression algorithms, such as Linear Regression, Random Forest and Gradient Boosting, but any regressor that is compatible with the scikit-learn API should work. The generate_forecast() function is responsible for the main functionality of the application, i.e. forecasting the future values of the energy generation time series. I accomplished that with the ReducedRegressionForecaster class, which applies the sliding window method to the time series and then trains a regression model with that data, a technique that was discussed earlier. Finally, I also created the calculate_smape() function, which returns the symmetric mean absolute percentage error (SMAPE), a useful metric that can help us evaluate the accuracy of our forecast.
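For readers who have not met SMAPE before, here is a small sketch of one common definition, the mean of 2|F - A| / (|A| + |F|) over all points, which ranges from 0 to 2. Several variants of the metric exist (some multiply by 100 or halve the result), and this is not necessarily the exact formula used in Renewcast; the function name calculate_smape_demo is hypothetical.

```python
def calculate_smape_demo(actual, forecast):
    """Symmetric mean absolute percentage error, as a fraction between 0 and 2."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denominator = abs(a) + abs(f)
        if denominator == 0:
            continue  # both values are zero, so this point contributes no error
        total += 2 * abs(f - a) / denominator
    return total / len(actual)


# A perfect forecast scores 0; the error grows as forecast and actual values diverge.
perfect = calculate_smape_demo([100, 200], [100, 200])
imperfect = calculate_smape_demo([100, 200], [110, 180])
print(perfect, round(imperfect, 4))
```

Because the error is scaled by the size of the values, SMAPE makes forecasts comparable across countries with very different levels of energy generation.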

Conclusion

I hope that after reading this article, you will be encouraged to develop your own Streamlit app, or even fork Renewcast (the GitHub repository is available here). Perhaps you will be inspired to acquire more knowledge about climate change, as well as the numerous ways it can be mitigated with machine learning. Feel free to share your thoughts in the comments, or follow me on LinkedIn, where I regularly post content about data science, climate change and other topics. You can also visit my personal website or check out my latest book, titled Simplifying Machine Learning with PyCaret.

References

[1] M. Joos, I. Staffell, Short-term integration costs of variable renewable energy: Wind curtailment and balancing in Britain and Germany (2018), Renewable and Sustainable Energy Reviews

[2] D. Rolnick, P. L. Donti, L. H. Kaack, K. Kochanski, A. Lacoste, K. Sankaran, et al., Tackling Climate Change with Machine Learning (2019), arXiv:1906.05433

[3] M. Löning, F. Király, Forecasting with sktime: Designing sktime’s New Forecasting API and Applying It to Replicate and Extend the M4 Study (2020), arXiv:2005.08067
