Today, we’re going to do some Data Science in Excel! Yes, contrary to what some out there might think, it is possible to do a reasonable amount of analytics in Excel.
Sure, statsmodels also has an ANOVA (analysis of variance) implementation, but I first saw it in Excel many years ago and thought, "What’s this?"

ANOVA is a really useful tool for comparing factors and decomposing variance. It’s also got a lot of similarities with clustering. Let’s get into our example.
An ANOVA Example With Financial Data
Here’s some tab-delimited data that you can copy and paste directly into Excel. It’s the monthly returns of energy stocks by season – let’s say we have a hypothesis that seasonality is an important determinant of energy stock returns.
(It needs to be organized this way – one column per group – in order to properly run the ANOVA analysis.)
Spring Summer Fall Winter
2.13% 4.40% -6.20% -5.35%
4.94% -2.90% -4.45% 4.93%
1.33% 2.74% -7.73% -1.12%
-2.50% -3.80% -7.09% -2.70%
7.67% -5.41% 10.32% 4.91%
-4.62% -5.35% -0.31% -8.38%
7.44% 4.95% 2.76% -2.09%
7.36% -1.38% 0.21% -0.31%
-1.66% 1.18% 5.21% 3.84%
We’re going to run ANOVA on it using Excel’s Data Analysis Tool (pictured above). Bear with me – I will explain what ANOVA is shortly. It’s just much easier to do with an example.
The output looks like:

The Summary table is pretty straightforward. It just provides basic summary statistics, such as the mean and variance of each of the variables. Interestingly, we can see that over the period of analysis, the average return for energy stocks was only positive during spring (which I define as March, April, May), and that energy stocks exhibited the most volatility during the fall.
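If you’d rather check the Summary numbers in code, here’s a quick numpy sketch – I’ve typed the returns in by hand from the table above, and np.var with ddof=1 gives the same sample variance Excel reports:

```python
import numpy as np

# Monthly energy returns by season, typed in from the table above (as decimals).
seasons = {
    "Spring": [0.0213, 0.0494, 0.0133, -0.0250, 0.0767, -0.0462, 0.0744, 0.0736, -0.0166],
    "Summer": [0.0440, -0.0290, 0.0274, -0.0380, -0.0541, -0.0535, 0.0495, -0.0138, 0.0118],
    "Fall":   [-0.0620, -0.0445, -0.0773, -0.0709, 0.1032, -0.0031, 0.0276, 0.0021, 0.0521],
    "Winter": [-0.0535, 0.0493, -0.0112, -0.0270, 0.0491, -0.0838, -0.0209, -0.0031, 0.0384],
}

# Mean and sample variance per bucket -- the same stats as Excel's Summary table.
for name, returns in seasons.items():
    print(name, round(np.mean(returns), 4), round(np.var(returns, ddof=1), 5))
```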
The second table, ANOVA, is where the action is at. SS stands for sum of squares. Once you’ve spent some time with statistics, you realize it’s all about variance – measuring it, explaining it, decomposing it. In the statistics world, variance is an estimate of uncertainty; so statisticians and data scientists are always trying to model it and explain it.
The equation for variance is:
Sample Variance = sum((Xi - mean(X))^2) / (n - 1)
The numerator, sum((Xi - mean(X))^2), is what we call the sum of squared deviations, or sum of squares (SS) for short. It’s simply the sum of the squared differences between each observation and the mean. For example, to calculate the total SS for energy stock returns (not bucketed by season), we would (see the sketch after this list):
- Calculate the mean monthly return of the energy index.
- Take the mean we just calculated and subtract it from each of the energy index’s monthly returns.
- Square each value from step 2.
- Sum all the values from step 3.
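For readers who’d rather verify this in Python than in a spreadsheet, here’s a minimal numpy sketch of those four steps, with the returns typed in by hand from the table above:

```python
import numpy as np

# All 36 monthly energy returns from the table above (as decimals), ignoring season.
returns = np.array([
    0.0213, 0.0494, 0.0133, -0.0250, 0.0767, -0.0462, 0.0744, 0.0736, -0.0166,    # Spring
    0.0440, -0.0290, 0.0274, -0.0380, -0.0541, -0.0535, 0.0495, -0.0138, 0.0118,  # Summer
    -0.0620, -0.0445, -0.0773, -0.0709, 0.1032, -0.0031, 0.0276, 0.0021, 0.0521,  # Fall
    -0.0535, 0.0493, -0.0112, -0.0270, 0.0491, -0.0838, -0.0209, -0.0031, 0.0384, # Winter
])

mean_return = returns.mean()           # step 1: mean monthly return
deviations = returns - mean_return     # step 2: subtract the mean from each return
squared = deviations ** 2              # step 3: square each deviation
total_ss = squared.sum()               # step 4: add them all up

print(round(total_ss, 5))  # ~0.08493 -- the total SS in the ANOVA table
```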
So What?
Because sum of squared deviations (which I will call SS in this article) is the numerator in the variance equation, it’s highly correlated to variance. The higher SS is, holding the number of observations constant, the higher the variance of a variable is.
Now let’s get back to ANOVA
Why is ANOVA called analysis of variance? Well that’s because ANOVA is a process for attempting to break down variance into its sources.
Remember that we started with a hypothesis – that the season matters for the performance of energy stocks. Let’s think about how we could test this.
What would we expect the data to look like if season mattered a lot? Well, once we sorted the data into buckets by season, we would expect the variation in returns within each season bucket to be very small and the variation across season buckets to be very large. In other words, we would expect a high SS across season buckets and a low SS within each season bucket (not coincidentally, some clustering algorithms work by maximizing cross-cluster variance while minimizing within-cluster variance). This is depicted visually below in the plot on the right.
What would we expect if season didn’t matter? It would be the opposite of what we just described. We should expect to see a low SS (low variation) across season buckets and a high SS (high variation) within each bucket. In other words, the variation pre and post sort would look more or less the same (left plot).

It’s starting to sound a lot like a hypothesis test. And at a basic level, that’s what ANOVA is:
ANOVA is a way to test whether the means of different groups are significantly different.
Interpreting Our ANOVA Results
But what I like about ANOVA is the way it breaks out the sources of variation (explaining variation is basically explaining variance).

Here’s the same ANOVA output I showed earlier. The green numbers are stats on the variation between groups, measured by comparing each group’s average to the overall average (is the average during summer, or the average during winter, very different from the overall average?). The yellow numbers are stats on the variation within a group (how much variation is there in energy returns during just the winter?). Total variation is just the total SS for all energy observations, regardless of season.
So ANOVA breaks out variation as follows:
Total SS = Between Groups SS + Within Groups SS
Where SS is an estimate of variation
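If you want to see that decomposition outside of Excel, here’s a rough numpy sketch that computes each piece from the raw returns – it lands on (roughly) the same SS numbers as the ANOVA table:

```python
import numpy as np

# Seasonal return buckets from the table above (as decimals).
seasons = {
    "Spring": [0.0213, 0.0494, 0.0133, -0.0250, 0.0767, -0.0462, 0.0744, 0.0736, -0.0166],
    "Summer": [0.0440, -0.0290, 0.0274, -0.0380, -0.0541, -0.0535, 0.0495, -0.0138, 0.0118],
    "Fall":   [-0.0620, -0.0445, -0.0773, -0.0709, 0.1032, -0.0031, 0.0276, 0.0021, 0.0521],
    "Winter": [-0.0535, 0.0493, -0.0112, -0.0270, 0.0491, -0.0838, -0.0209, -0.0031, 0.0384],
}

all_returns = np.concatenate([np.array(r) for r in seasons.values()])
grand_mean = all_returns.mean()

# Between-groups SS: how far each season's mean sits from the overall mean.
ss_between = sum(len(r) * (np.mean(r) - grand_mean) ** 2 for r in seasons.values())

# Within-groups SS: how far each return sits from its own season's mean.
ss_within = sum(((np.array(r) - np.mean(r)) ** 2).sum() for r in seasons.values())

# Total SS: how far each return sits from the overall mean.
ss_total = ((all_returns - grand_mean) ** 2).sum()

print(round(ss_between, 5), round(ss_within, 5), round(ss_total, 5))
# ~0.00677, ~0.07816, ~0.08493 -- and between + within = total
```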
Now we can check for the situations we described above:
- If between-group variation is high relative to within-group variation, then season likely matters.
- If between-group variation is not high relative to within-group variation, then season likely doesn’t matter.
The F statistic that we use to quantitatively test what I just described is calculated from the numbers in green and yellow as follows:
F = (SS_between/df_between) / (SS_within/df_within)
= (0.00677/3) / (0.07816/32) = 0.924
The intuition for why we divide SS by df (degrees of freedom) is that SS grows with the number of observations – the more observations there are, the higher the SS. Dividing by df normalizes it, so we can compare variables (or groups) with unequal numbers of observations.
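As a quick sanity check, you can reproduce the F-statistic straight from the SS and df numbers in the ANOVA table:

```python
# Numbers from the ANOVA table above.
ss_between, df_between = 0.00677, 3
ss_within, df_within = 0.07816, 32

# Normalize each SS by its degrees of freedom (these are the "MS" columns in Excel's output).
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_stat = ms_between / ms_within
print(round(f_stat, 3))  # ~0.924
```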
This is our test statistic. If you need a refresher on hypothesis testing, I cover the basics here.
But at a high level, the larger the F-statistic, the less likely it is that what we observed is due to random chance alone. In this case our null hypothesis is that the seasonal means are not statistically different – or that the variation within groups trumps the variation between groups.
In our example, to be significant at the 5% level, we need to see an F-statistic of 2.90 or more. Since our F-statistic is just 0.924, we fail to reject the null hypothesis. A more intuitive way to think about our conclusion is by looking at the p-value (which is derived from the F-statistic) of 0.44 – a p-value of 0.44 means that there is a 44% chance of seeing differences between the seasonal means at least this large purely by random chance. That’s pretty high! No wonder we failed to reject the null hypothesis.
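If you’re curious where the 2.90 cutoff and the 0.44 p-value come from, scipy can reproduce the whole test – f_oneway runs the same one-way ANOVA Excel’s tool does, and the critical value falls out of the F distribution with (3, 32) degrees of freedom:

```python
from scipy import stats

# Seasonal return buckets from the table above (as decimals).
spring = [0.0213, 0.0494, 0.0133, -0.0250, 0.0767, -0.0462, 0.0744, 0.0736, -0.0166]
summer = [0.0440, -0.0290, 0.0274, -0.0380, -0.0541, -0.0535, 0.0495, -0.0138, 0.0118]
fall   = [-0.0620, -0.0445, -0.0773, -0.0709, 0.1032, -0.0031, 0.0276, 0.0021, 0.0521]
winter = [-0.0535, 0.0493, -0.0112, -0.0270, 0.0491, -0.0838, -0.0209, -0.0031, 0.0384]

# One-way ANOVA in a single call: same F and p-value as the Excel output.
f_stat, p_value = stats.f_oneway(spring, summer, fall, winter)

# 5% critical value of the F distribution with (3, 32) degrees of freedom.
f_crit = stats.f.ppf(0.95, dfn=3, dfd=32)

print(round(f_stat, 3), round(p_value, 2), round(f_crit, 2))  # ~0.924, ~0.44, ~2.90
```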
Understand The Sources Of Variation
Now that we’ve got the technical stuff out of the way, let’s focus on intuition for a bit. Dividing SS_between by SS_total gives 0.00677/0.08493 = 8%. This means that a mere 8% of the total variation is explained by season. So season is not a big driver of energy stock returns.
A good question is: how high does SS_between have to be relative to SS_total (which doesn’t change) for the difference in means to be statistically significant? And what does that mean in terms of variation explained?
To hit a 2.90 F-statistic, SS_between would need to be 0.0182, almost 3 times higher. In this case, season would explain 21.4% of the total variation. So even when the means are found to be statistically different, there can still be significant variation within each bucket.
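Here’s the back-of-the-envelope calculation behind that 0.0182 figure – it’s just the F-statistic formula rearranged to solve for SS_between, holding total SS fixed:

```python
# F_crit = (ss_b / df_b) / ((ss_total - ss_b) / df_w), rearranged to solve for ss_b.
ss_total = 0.08493
f_crit = 2.90
df_between, df_within = 3, 32

ss_between_needed = f_crit * df_between * ss_total / (df_within + f_crit * df_between)

print(round(ss_between_needed, 4))             # ~0.0182
print(round(ss_between_needed / ss_total, 3))  # ~0.214, i.e. ~21.4% of variation explained
```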
Conclusion
Let’s finish things off with a box plot – my favorite way to visualize variations across category buckets.

We can see that while spring returns are higher, the overall distributions for all the seasons more or less sit on top of each other – just as we found when we failed to reject the null hypothesis.
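If you’d rather draw this chart in Python than in Excel, a minimal matplotlib sketch that produces a similar box plot looks like this:

```python
import matplotlib.pyplot as plt

# Seasonal return buckets from the table above (as decimals).
spring = [0.0213, 0.0494, 0.0133, -0.0250, 0.0767, -0.0462, 0.0744, 0.0736, -0.0166]
summer = [0.0440, -0.0290, 0.0274, -0.0380, -0.0541, -0.0535, 0.0495, -0.0138, 0.0118]
fall   = [-0.0620, -0.0445, -0.0773, -0.0709, 0.1032, -0.0031, 0.0276, 0.0021, 0.0521]
winter = [-0.0535, 0.0493, -0.0112, -0.0270, 0.0491, -0.0838, -0.0209, -0.0031, 0.0384]

# One box per season bucket -- the overlapping boxes echo the "fail to reject" result.
plt.boxplot([spring, summer, fall, winter])
plt.xticks([1, 2, 3, 4], ["Spring", "Summer", "Fall", "Winter"])
plt.ylabel("Monthly return")
plt.title("Energy stock returns by season")
plt.show()
```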
ANOVA is a great tool for helping you understand the drivers of observed variance. For example, if you are trying to understand what drives the differences in how much shoppers spend, you can sort the shoppers into buckets and test the quality of the sort with ANOVA. This is like a reverse cluster analysis. In clustering, you sort first and then try to figure out what the clusters mean. With ANOVA, you hypothesize that a certain categorization matters, then you test to see if you are right.
It might be a bit more tedious, but the results that come out will definitely be more interpretable (and easier to explain to your boss). Cheers!