
The Science and Art of Causality (part 1)

If we cannot directly test for causality, what should we do?

Let me take you on a journey into my field of expertise, which is also my passion, my obsession, what I like to call: the science and art of causality.

"Causality" refers to the relationship between cause and effect. It’s the idea that one event or action can lead to another event or outcome. In other words, causality is concerned with understanding how things happen and why they happen.

In this article, we will first answer two questions: why is understanding causality so important, and why is it so hard to assess (the fundamental problem of causal inference)? Then, we will look at two main ways to measure a causal effect (randomized controlled trials and natural experiments). Finally, I will illustrate how to question causality and give you practical tools to do it.

Why causality?

First things first: Why should we care about Causality? Why am I so obsessed with this topic? Well, because every decision we make, whether as individuals or organizations, is based on the assumption that certain actions will lead to certain outcomes.

For example, if we decide to go vegan, it’s because we believe it will have health or environmental benefits. Or if a company changes its advertising strategy, it’s because it hopes to improve sales or subscriptions. Similarly, governments must consider how their actions will impact the world around us, such as whether transitioning to renewable energy sources will help meet climate goals. However, understanding causality can be complex and requires careful analysis to determine the true relationships at play. That’s where the science and art of causality comes in – it helps us examine and interpret data to better understand the world around us and make informed decisions.

The thing is, if we fail to properly assess the relationship, the causal chain, the consequences might be very costly. Companies or governments might put a lot of effort and resources into collecting data, and then even more resources into analyzing it. But if the conclusion is wrong, they might lose even more resources by following the wrong path. That's why it is so important to assess causality.

Additionally, understanding causality also helps when we read the news, listen to politicians, or discuss with others, by reducing the risk of being manipulated or falling for misinformation.

Moreover, the good news is that there are tools that you can learn, and practical methods that you can apply daily to fight misinformation and to make better, or more educated, decisions.

Why is it so hard to assess causality?

So, why is it actually so hard to assess causality? The thing is, there is no statistical test that will tell you whether your effect is causal or not. You can do many things in that direction: numerous statistical tests can challenge what we call the identifying assumptions (the assumptions needed to identify a causal effect), but usually we cannot test those assumptions directly.

This is what makes understanding causality so exciting! It requires both a strong foundation in statistics and deep domain knowledge. You have to use your critical thinking skills to consider the relationships between different variables and events. Math alone will not save you. The main challenge is often in how we interpret the statistical measures we use – we may mistake a correlation for a causal effect.

The fundamental problem of causal inference

The key aspect comes from the fundamental problem of causal inference. Let me illustrate this concept with the two graphs below. On the left, you have the world's direct primary energy consumption, split between renewable energies and other sources of energy production. The second graph represents the world's CO2 emissions, also from 1900 to 2020. Both are growing. However, we tend to think that using renewable energy helps to reduce CO2 emissions. Using such aggregate data, some people might be tempted to argue that renewable energies are not necessarily useful.

Obviously, it's impossible to answer the question (what is the effect of renewable energy on CO2 emissions?) with such simple statistics. The thing is, we don't know what would happen without renewable energies. We don't have a world without renewable energies.

The relationship between renewable energy and CO2 emissions is complex. On one hand, renewable energy sources like solar and wind power can substitute for fossil fuels and thus lower CO2 emissions. On the other hand, extracting the resources needed to produce photovoltaic cells is energy-intensive, and the "rebound effect" (where people consume more energy when it is provided by renewable sources) can push CO2 emissions up. Without more data, it is impossible to definitively answer the question of whether renewable energy leads to higher or lower CO2 emissions.

To perfectly answer this question, we would need two parallel worlds.

In one of the two worlds you have renewable energy and in the other, you don’t. And as it’s the only thing that changed between the two worlds, if there is a difference with respect to CO2 emissions it would most certainly be caused by renewable energy use.

Unfortunately, we don’t have access to parallel worlds where we can observe the same situation with and without a particular treatment or action. This creates the "fundamental problem in causal inference", as we cannot observe the counterfactual – the alternative reality in which a treatment or action was not taken. For example, we can’t observe a person who is sick both taking and not taking a medication at the same time. What we try to do in causal inference is to get as close as possible to this ideal situation, where we can compare the outcomes of the same situation with and without a particular treatment or action. This allows us to better understand the causal relationships at play.
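To make the idea of the missing counterfactual concrete, here is a minimal, entirely hypothetical simulation in Python. Because we simulate both "worlds" for each person, we can compute the true average treatment effect; in real data, half of this table would simply not exist (all numbers below are made up for illustration).

```python
import random

random.seed(42)

# Toy potential-outcomes table: we write down BOTH worlds for each person,
# which is only possible because this is a simulation, never in real data.
n = 10_000
y0 = [random.gauss(50, 10) for _ in range(n)]  # outcome without treatment
y1 = [y + 5 for y in y0]                       # outcome with treatment (true effect: +5)
treated = [random.random() < 0.5 for _ in range(n)]

# The individual causal effect y1 - y0 requires both worlds at once...
true_ate = sum(b - a for a, b in zip(y0, y1)) / n

# ...but a real dataset reveals only ONE outcome per person; the other
# (the counterfactual) is forever missing.
observed = [y1[i] if treated[i] else y0[i] for i in range(n)]

print(f"true average treatment effect (knowable only in simulation): {true_ate:.2f}")
```

Causal inference methods are, in essence, strategies for filling in the missing half of this table as credibly as possible.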

The gold standard

Usually, the first way mentioned to address this issue, often called the gold standard and arguably the best solution, is the randomized controlled trial (A/B testing).

In short, the concept is the following. We take a sample that is, hopefully, representative of a larger population, and randomly allocate subjects between two (or more) groups: treatment and control. The subjects typically do not know whether they receive the treatment (a process known as blinding). Therefore, the two groups are arguably comparable. Since the only difference is the treatment, if we observe an effect, it is potentially causal, provided that no other biases exist.
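The value of randomization can be sketched with a small hypothetical simulation: when subjects self-select into treatment (here, healthier people opting in more often), a naive difference in means is badly biased; with a coin-flip assignment, the same estimator lands near the true effect. All parameters below are invented for illustration.

```python
import random

random.seed(0)

def outcome(health, treated):
    # Hypothetical outcome model: baseline health plus a true effect of +5.
    return health + (5 if treated else 0) + random.gauss(0, 1)

n = 20_000
health = [random.gauss(50, 10) for _ in range(n)]

# Self-selection: healthier people opt into treatment more often.
self_selected = [random.random() < (0.8 if h > 50 else 0.2) for h in health]

# Randomization: a coin flip, independent of health.
randomized = [random.random() < 0.5 for _ in range(n)]

def diff_in_means(assign):
    yt = [outcome(h, True) for h, a in zip(health, assign) if a]
    yc = [outcome(h, False) for h, a in zip(health, assign) if not a]
    return sum(yt) / len(yt) - sum(yc) / len(yc)

est_selfselected = diff_in_means(self_selected)
est_randomized = diff_in_means(randomized)
print(f"self-selected estimate: {est_selfselected:.1f} (biased upward)")
print(f"randomized estimate:    {est_randomized:.1f} (close to the true +5)")
```

The only thing that changed between the two estimates is how subjects were assigned, which is exactly the point of an RCT.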

However, there are two main weaknesses with randomized controlled trials (RCTs). The first is that we cannot always use them. Sometimes it's impossible, for example, to change the sex of a subject just for the sake of an experiment. In other situations, it is unethical. For example, I have a paper assessing the effect of weapons exports on the probability of conflict in Africa. We are not going to randomly send weapons to different countries to see if it affects the probability of conflict.

The second main weakness is that perfectly controlling the environment of an experiment may come at the cost of external validity (the extent to which we can extrapolate the results beyond the scope of the study). For example, medical research is often done with inbred strains of rats or mice. These animals are almost genetically identical, and hence we are close to the parallel-worlds situation. But the drawback, again, is that we lose external validity.

Hence, there is often a trade-off between measuring a causal effect perfectly and the extent to which the results reflect real-life situations.

Let me illustrate this idea with a fantastic paper: Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial (https://www.bmj.com/content/363/bmj.k5094). This paper was published in a top medical journal: the British Medical Journal (BMJ).

In this experiment, the researchers managed to enroll 23 volunteers over a period of one year, between 2017 and 2018, and made them jump from a plane. The participants were randomly assigned to two groups, either with a parachute or with an empty backpack. The researchers measured the probability of death or major physical trauma directly after impact, and found no difference between the two groups with respect to those outcomes (death and major traumatic injury).

So, where's the catch? It's a real experiment, but to be able to run it, the researchers of course used a small stationary plane that never left the ground. People were jumping from roughly one meter off the ground. The idea of the paper was to highlight that sometimes you try to control the environment of an experiment perfectly, but then it no longer reflects reality.

Natural experiments

So then, if it’s not necessarily possible to do a randomized control trial, what can we do? Well, we can resort to what we call quasi-experimental design or natural experiments.

"Natural experiments are observational studies in which an event or a situation that allows for the random or seemingly random assignment of study subjects to different groups is exploited to answer a particular question." Britannica

Let me take an example to illustrate such an experiment. Let's say you want to assess the effect of pollution on health (e.g., the risk of respiratory diseases). You can use animals in a lab, but you are not going to take humans and expose them to lethal or very dangerous levels of pollution. And even if animal research might be useful, we might struggle to extrapolate the results to humans. What we are really interested in is what happens in real life, with long exposure to pollution and humans living their daily lives. However, if we compare people living in cities to those in rural areas, they are not comparable: other things make them very different, for example, what they eat, how they move, their type of work, etc.


Hence, to answer such a question, we can resort to a natural experiment. That is exactly what some researchers did in 2016 (https://www.sciencedirect.com/science/article/abs/pii/S0095069616300237). The authors used the Olympic Games in Beijing to measure the causal effect of air pollution on mortality. The government imposed very strict rules to reduce pollution over the city before and during the Games (e.g., closing power plants and restricting car use). This situation made it possible to observe exactly the same people before, during, and after the Games. These individuals experienced a sudden change in their level of pollution exposure, going from a high level to a significantly lower one. The authors found that "a 10 percent decrease in PM10 [small particulate matter in the air] concentrations reduces the monthly standardized all-cause mortality rate by 8 percent."
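The logic of this design can be sketched with a toy simulation: the same population is observed under two pollution regimes, and since nothing else changes, the drop in deaths can be attributed to the drop in exposure. The dose-response parameters and PM10 levels below are invented for illustration and do not come from the paper.

```python
import random

random.seed(7)

def monthly_deaths(population, pm10):
    # Hypothetical dose-response: individual mortality risk rises with PM10.
    rate = 0.0005 * (1 + 0.008 * pm10)
    return sum(1 for _ in range(population) if random.random() < rate)

population = 1_000_000  # the same (simulated) city residents in both periods

deaths_before = monthly_deaths(population, pm10=140)  # pre-Games pollution level
deaths_during = monthly_deaths(population, pm10=90)   # under strict restrictions

print(f"monthly deaths before the Games: {deaths_before}")
print(f"monthly deaths during the Games: {deaths_during}")
```

The key assumption the simulation makes explicit: because the population is identical in both periods, the exposure change is the only difference, mimicking the parallel-worlds ideal.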

How to challenge causality?

The two main threats to causality (endogeneity issues) are that there are often other factors affecting the relationship you are looking at (omitted variable bias), and that the effect might go in both directions (simultaneity, or reverse causality).

As we cannot test directly for causality, what can we do? You can always ask whether there is something else explaining the result, or whether the reverse is true at the same time. Use those two questions to challenge any causal statement.

First, let me illustrate the first concept: omitted variable bias (is there something else?). Research has found a strong negative correlation between coffee consumption and cardiovascular disease (https://academic.oup.com/eurjpc/article-abstract/29/17/2240/6704995?redirectedFrom=fulltext). There may be reasons why the link between coffee consumption and health mentioned in the paper (Chieng et al., 2022) is not necessarily causal. One thing to consider is that other factors may influence both coffee consumption and overall health. For example, people who are more physically active might consume more coffee and be healthier due to their physical activity. It's important to always consider the possibility that other factors may be at play, and not to interpret an observed relationship as a causal effect without further analysis.
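Here is a minimal sketch of that mechanism: a hypothetical confounder (physical activity) drives both coffee drinking and disease risk, while coffee itself has, by construction, zero effect. A naive comparison still shows drinkers doing better; conditioning on the confounder makes the gap vanish. All probabilities are made up for illustration.

```python
import random

random.seed(1)

# Simulated population: activity raises coffee drinking AND lowers disease risk;
# coffee plays no causal role at all in this toy model.
n = 50_000
rows = []
for _ in range(n):
    active = random.random() < 0.5
    coffee = random.random() < (0.7 if active else 0.3)
    disease = random.random() < (0.05 if active else 0.15)
    rows.append((active, coffee, disease))

def disease_rate(subset):
    return sum(d for _, _, d in subset) / len(subset)

# Naive comparison: drinkers look healthier, purely through the confounder.
naive_gap = (disease_rate([r for r in rows if not r[1]])
             - disease_rate([r for r in rows if r[1]]))

# Condition on activity: within each stratum, the coffee "effect" disappears.
stratum_gaps = []
for act in (True, False):
    yes = [r for r in rows if r[0] == act and r[1]]
    no = [r for r in rows if r[0] == act and not r[1]]
    stratum_gaps.append(disease_rate(no) - disease_rate(yes))

print(f"naive gap (non-drinkers minus drinkers): {naive_gap:.3f}")
print(f"within-activity gaps: {[round(g, 3) for g in stratum_gaps]}")
```

In real data we rarely observe every confounder, which is exactly why "is there something else?" has to be asked rather than tested away.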

The second question you should ask is: could it be the reverse? When you observe that people who drink more are more depressed, is it because they drink, or are they drinking because they are depressed? Research has found evidence of both effects. Hence, from this correlation alone it is very hard to assess the effect of each on the other, because the two effects are mixed together.
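A quick sketch shows why correlation alone cannot settle the direction. Below, two hypothetical worlds are simulated: in one, drinking causes depression; in the other, depression causes drinking. Both produce essentially the same correlation, so the observed number is silent about which world we live in (all coefficients are invented).

```python
import random
import statistics

random.seed(3)
n = 20_000

def corr(xs, ys):
    # Pearson correlation, computed by hand with the standard library.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

# World A: drinking causes depression.
drink_a = [random.gauss(0, 1) for _ in range(n)]
depr_a = [0.6 * d + random.gauss(0, 0.8) for d in drink_a]

# World B: depression causes drinking (same strength, opposite direction).
depr_b = [random.gauss(0, 1) for _ in range(n)]
drink_b = [0.6 * d + random.gauss(0, 0.8) for d in depr_b]

r_a = corr(drink_a, depr_a)
r_b = corr(drink_b, depr_b)
print(f"correlation in world A (drinking -> depression): {r_a:.2f}")
print(f"correlation in world B (depression -> drinking): {r_b:.2f}")
```

Distinguishing the two worlds requires something beyond the correlation, such as an experiment, a natural experiment, or timing information.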

Conclusion

As we cannot test directly for causality, it is important to use your critical thinking to challenge any causal statement you hear or read. Start by thinking about the parallel-worlds situation and try to spot the difference between this ideal situation and the situation you actually face when assessing a causal effect. This usually helps to identify where things could go wrong. Then, use your two questions: could it be something else? Could it be the reverse?

To go further, read part 2 of this article.

