
Over the last couple of months I have been selecting a supplier to install solar panels on my house. What I noticed is that every supplier uses its own calculations, and the fact that my house has no standard orientation makes things difficult for them. I am in the unlucky situation that the majority of the solar panels need to be placed on a vertical wall instead of on a roof.
So in order to compare the different suppliers on generated power and to have good calculations for panels with a tilt of 90 degrees, I started making my own calculations. For this I created a calculation method where you can enter your own configuration for several panel locations with their own characteristics. The results of these locations will be combined. Different locations can be for example the front and back roof of your house.
Obtain solar irradiation information
I started making my own solar path calculations but then found the excellent pvlib package on GitHub. It saved my day by solving the majority of the issues for me.
The pvlib package is used to determine the amount of solar irradiation and the power generated by the solar panels. In my case it uses the solar irradiation data from 2005 till 2020, as made available by the European Commission. Real-life data is used instead of some optimal situation with sunshine all the time.
The following code obtains the data for a set of panels:
This method takes the location (latitude, longitude) and the panel configuration to obtain the irradiation and power data. The panel configuration consists of the number of panels, their azimuth, tilt angle and peak power. The else-statement and the code following it are required to be able to specify zero panels: the library cannot handle a peak power of ‘0’ when no panels are specified. For the comparison of different layout options, it can be useful to specify zero. The data source used contains data for the years 2005 till 2020.
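A minimal sketch of such a function, assuming `pvlib.iotools.get_pvgis_hourly` as the data source. The `fetch` parameter is a hypothetical hook that is not part of the article; it only exists so the zero-panel handling can be tried without network access. Depending on the pvlib version, `surface_azimuth` is interpreted with south = 0 (older releases) or north = 0 (newer releases), so check the documentation of the version you use.

```python
import pandas as pd


def obtain_power_panel_data(latitude, longitude, name, tilt, azimuth,
                            nopanels, power, fetch=None):
    """Return hourly irradiation and power for one group of panels.

    `fetch` is a hypothetical hook (an assumption of this sketch); by
    default the PVGIS helper from pvlib is used.
    """
    if fetch is None:
        from pvlib.iotools import get_pvgis_hourly  # imported lazily
        fetch = get_pvgis_hourly

    # A peak power of 0 is not accepted, so request 1 kWp as a stand-in
    # when no panels are specified and zero the result afterwards.
    peakpower = nopanels * power if nopanels > 0 else 1.0

    result = fetch(latitude, longitude, start=2005, end=2020,
                   surface_tilt=tilt, surface_azimuth=azimuth,
                   pvcalculation=True, peakpower=peakpower)
    data = result[0].copy()  # the first element of the tuple is the data frame

    if nopanels == 0:
        data["P"] = 0.0  # no panels, so no generated power
    data["location"] = name
    return data
```

Taking the first element of the returned tuple keeps the sketch independent of how many metadata objects the library returns alongside the data frame.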
A request results in a data frame:

For every hour of every day between 01/01/2005 and 31/12/2020, this data frame contains the solar irradiation and the generated power.
Note that an azimuth of ‘0’ is South and not North. A positive azimuth points towards the west, a negative one towards the east.
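Since azimuth conventions differ between tools, a small helper can make the south-based values easier to check. The sketch below assumes the PVGIS convention (0 = south, 90 = west, -90 = east), which matches the example configuration in this article where an azimuth of 30 looks South-West. Both function names are hypothetical.

```python
def south_based_to_compass(azimuth):
    """Convert a south-based azimuth (0 = south, 90 = west, -90 = east)
    to a compass bearing (0 = north, 90 = east, 180 = south, 270 = west)."""
    return (azimuth + 180) % 360


def compass_name(bearing):
    """Name the nearest of the eight principal compass directions."""
    names = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return names[int((bearing % 360) / 45 + 0.5) % 8]


for azimuth in (30, 150):
    bearing = south_based_to_compass(azimuth)
    print(azimuth, bearing, compass_name(bearing))
```

For the two azimuths used in this article (30 and 150) this gives bearings of 210 and 330 degrees, i.e. roughly South-West and North-West.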
Panel location
Not all panels might be placed on the same roof, so we want to be able to specify a configuration of several locations (e.g. roofs, on a shed, on the ground, on a wall, etc.). This configuration is as follows:
[{'name': 'front', 'tilt': 35, 'azimuth': 30, 'nopanels': 2, 'power': 0.385},
{'name': 'back', 'tilt': 35, 'azimuth': 150, 'nopanels': 8, 'power': 0.385}]
It is possible to specify the azimuth (south = 0), tilt, number of panels and peak power per panel for each section. The last one is required in case different panels are used.
A wrapper is built around the obtain_power_panel_data method that accepts the configuration and obtains data for each roof/section.
This method accepts the dictionary with panel locations and returns a data frame with the solar data for all panels. It also obtains the sun location (lines 27–30) and adds this to the solar radiation information. For convenience, the month and season columns are added.
An example usage is:
This specifies a location named ‘front’ with two panels looking South-West on a roof at 35 degrees. The second location looks North-West and contains 8 panels at the same angle of 35 degrees. The same solar panels are used for both, with a peak performance of 0.385 kW each.
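The configuration described above could be assembled from parallel lists, roughly as follows. The variable names are illustrative; only the dictionary keys follow the configuration format shown earlier.

```python
# One entry per panel location, kept in parallel lists for readability.
names = ["front", "back"]
tilts = [35, 35]
azimuths = [30, 150]        # south = 0
nopanels = [2, 8]
powers = [0.385, 0.385]     # peak power per panel in kW

configuration = [
    {"name": n, "tilt": t, "azimuth": a, "nopanels": c, "power": p}
    for n, t, a, c, p in zip(names, tilts, azimuths, nopanels, powers)
]

# Total installed peak power over all locations, in kW.
total_kwp = sum(s["nopanels"] * s["power"] for s in configuration)
```

Summing the peak power is a quick sanity check that the configuration matches the quotation being evaluated.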
It is possible to create the dictionary in one line of code but the arrays are used for readability. The resulting data frame is:

This is the complete data set we need for evaluation: for every hour of every day, the solar irradiation and generated power are available for each set of solar panels. Each record also contains the sun's position in the sky. We are now ready to evaluate the performance of the solar panels.
Performance during one day
First, we will evaluate the performance over a single day. For this, we search for the day with the highest generated power in the data set and then plot the generated power over the course of this day, both for each individual roof and for the total.
We start by searching for the day with the highest generated output, in this case for the year 2020 (line 35). The data set is filtered for this year (line 2), and then the sum of the generated power (column ‘P’) per day is calculated (line 3), returning a data frame with the date as index and the generated power as column. After finding the maximum value with idxmax (line 4), the corresponding date is returned as a string in the form 2020-05-28 (line 5).
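The search described above can be sketched in a few lines of pandas. The function name `get_best_day` is hypothetical, and the sketch assumes a data frame with a DatetimeIndex and a power column ‘P’, as described in this article.

```python
import pandas as pd


def get_best_day(data, year):
    """Return the date (YYYY-MM-DD) with the highest total of column 'P'."""
    in_year = data[data.index.year == year]                    # keep one year
    per_day = in_year.groupby(in_year.index.date)["P"].sum()   # total per day
    return per_day.idxmax().strftime("%Y-%m-%d")               # label of the maximum
```

Because the rows of all panel locations share the same timestamps, grouping by date automatically sums over all locations.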
This date and the full data frame are given to the plot_a_day method, which results in:

After filtering the data for the given date (line 7), two plots are generated: the left one shows the power generated per time unit over the day, and the right one shows the cumulative generated power over the day. The separate panel locations are plotted with dotted lines, the total over all panels with a solid line.
Line 9 creates two axes for the graphs. Line 10 iterates over all panel locations. Line 11 filters the data for the current location and plots the hourly data (day_data[day_data.location == name]['P']) as a dotted line in the left graph. Line 12 plots the cumulative data (...['P'].cumsum()) in the right plot. Lines 13 and 14 add the daily total to the right of the graph. Lines 16 to 19 do the same for the total over all locations. Finally, lines 21 to 31 format the axes and the graph and add the appropriate labels.
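The series behind both graphs can be prepared without any plotting code. The sketch below is an approximation under the same assumptions as before (DatetimeIndex, columns ‘P’ and ‘location’); the function name `day_curves` is hypothetical and the matplotlib formatting is left out.

```python
import pandas as pd


def day_curves(data, date):
    """Return hourly and cumulative power per location and in total."""
    day_data = data[data.index.strftime("%Y-%m-%d") == date]

    curves = {}
    for name in day_data["location"].unique():
        hourly = day_data[day_data.location == name]["P"]
        curves[name] = pd.DataFrame({"P": hourly,
                                     "cumulative": hourly.cumsum()})

    # Sum over all locations per timestamp for the solid 'total' line.
    total = day_data.groupby(level=0)["P"].sum()
    curves["total"] = pd.DataFrame({"P": total,
                                    "cumulative": total.cumsum()})
    return curves
```

The last value of each cumulative column is the daily total that is printed to the right of the graph.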
Performance over the months
The next step is looking at the performance over the months of a year:
This generates an overview of the power generated per month and the cumulative over the year:

The code follows roughly the same structure as the plot over a day, except that the data is filtered at year level and summed per month.
Additionally, line 8 calculates the average per month, line 9 plots this value as a dotted horizontal line and line 10 plots the average value next to the line.
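The aggregation behind this graph might look roughly like the following. The function name `monthly_overview` is hypothetical and plotting is omitted; the sketch assumes the same data frame layout as before.

```python
import pandas as pd


def monthly_overview(data, year):
    """Return per-month and cumulative power for one year, plus the mean."""
    in_year = data[data.index.year == year]
    per_month = in_year.groupby(in_year.index.month)["P"].sum()
    overview = pd.DataFrame({"P": per_month,
                             "cumulative": per_month.cumsum()})
    # The average is taken over the months that occur in the data.
    return overview, per_month.mean()
```

The returned mean is the value that would be drawn as the dotted horizontal line.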
Performance over the years
The data frame contains data for the years 2005 till 2020. This means we can calculate the performance over the years (not taking performance degradation into account). This simulates the situation where the solar panels would have been in place during this whole period:
Resulting in:

Again, the method starts by aggregating the data, but now per year. Then the generated power per year and the cumulative sums are plotted (lines 5–7). The average per year is added (lines 13–14) and finally the graph is formatted.
As we can see in this graph, yearly performance is far from constant. In this example it fluctuates between 2800 kWh and 3200 kWh per year, depending on the weather in that year.
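To put a number on this fluctuation, the yearly totals can be summarised as below. This is a small sketch with a hypothetical function name; "spread" is defined here as the difference between the best and worst year relative to the mean, which is a choice made for this example.

```python
import pandas as pd


def yearly_overview(data):
    """Return the total of column 'P' per year and a short summary."""
    per_year = data.groupby(data.index.year)["P"].sum()
    summary = pd.Series({
        "mean": per_year.mean(),
        "min": per_year.min(),
        "max": per_year.max(),
        # Difference between best and worst year, relative to the mean.
        "spread": (per_year.max() - per_year.min()) / per_year.mean(),
    })
    return per_year, summary
```

With a worst year of 2800 kWh, a best year of 3200 kWh and a mean of 3000 kWh, this definition gives a spread of roughly 13 percent.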
BONUS: Seasonal performance
With solar panels, it is all about your total power generation over the year. But during the year there can be some striking differences, depending on your geographical location. The path of the sun changes over the year, and this can affect the effectiveness of the solar panels. When we take the panel locations of this article and plot one day in each season (the day with the highest power generation), some remarkable results can be seen:

The graph shows the sun's location over the day, the height of the sun and, per panel location, the generated power. During spring and summer the ‘back’ panels perform far better (3x) than the ‘front’ panels, but during winter and fall the difference is small (~1.2x). The performance of the ‘front’ panels is relatively constant over the year, except for the fall.
Drawing these graphs is a bit more complicated:
The plot_seasons method filters the data for the requested year (line 22) and determines the maximum generated power (lines 23–24), rounded up to tens, which is used to give all graphs the same y-axis limits. Next, a grid is created for the four graphs (lines 26–27).
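Rounding the maximum up to the next multiple of ten can be done as in this small sketch; the function name is hypothetical.

```python
import math


def round_up_to_tens(value):
    """Round a non-negative value up to the next multiple of ten."""
    return math.ceil(value / 10) * 10


# Shared upper y-axis limit for all four season graphs.
y_limit = round_up_to_tens(47.3)
```

A maximum of 47.3 gives a shared limit of 50, so all four graphs are drawn on the same scale and can be compared by eye.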
For each season, the day with the highest generated power is determined by get_best_day_of_season, and that day is plotted in one of the four grid fields.
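A possible shape for get_best_day_of_season is sketched below. The signature is an assumption, since the original listing is not shown here, and the sketch relies on the ‘season’ column that was added to the data frame earlier.

```python
import pandas as pd


def get_best_day_of_season(data, year, season):
    """Return the date (YYYY-MM-DD) with the highest total 'P' in a season."""
    mask = (data.index.year == year) & (data["season"] == season).to_numpy()
    subset = data[mask]
    per_day = subset.groupby(subset.index.date)["P"].sum()
    return per_day.idxmax().strftime("%Y-%m-%d")
```

Note that filtering on a calendar year means that a season spanning the turn of the year is assembled from months at the start and the end of that same year.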
The plot_day_extended method splits the axis into a left and a right axis and plots the sun elevation and zenith on the left axis (lines 9–10) and the panel performance on the right axis (lines 11–12). Lines 13–19 finalize the formatting of the graphs.
Conclusion
The pvlib library simplifies the calculation of solar panel performance. This makes it easier to compare the quotations of different contractors. Their quotations all use different ways to calculate the effectiveness, and with the code above they can be compared on an equal basis.
Calculating the financial impact is kept out of this comparison, since it is highly dependent on economic circumstances and laws. Comparing the generated power is more transparent.
The full notebook can be found on GitHub. As an additional bonus, it contains code to geocode an address to lat-lon coordinates to simplify usage. The address and panel locations can be specified in the third cell from the bottom. Enjoy!
Final words
I hope you enjoyed this article. For more inspiration check some of my other articles:
- Perform a function on columns in a CSV file
- Create a heatmap from the logs of your activity tracker
- Remove personal information from text with Python
- Parallel web requests with Python
- All public transport leads to Utrecht, not Rome
Disclaimer: The views and opinions included in this article belong only to the author.