
Read with Me: A Causality Book Club

Starting from a cat story…

Photo by Humberto Arellano on Unsplash


I have three cats. I love all of them, but I have to admit they have different intelligence levels. The smartest one’s name is MaoMao. Recently, I noticed MaoMao has picked up a new habit. Whenever I am eating lunch, he sits next to me. I’d like to believe he loves me and wants to spend time with me, but his aloof personality suggests otherwise. I also noticed that he is alert when sitting next to me, constantly looking for something. After some observation, I realized he is waiting for a sunlight reflection to chase. You know how crazy cats are about lasers, light reflections, and the like. The dining table receives full sunlight at noon, and sometimes the reflection from my watch appears on the nearby walls or the ceiling.

image by author

MaoMao has picked up the pattern: whenever I sit at the dining table, there will be a light to chase. My other two cats haven’t figured out this pattern yet, but they have their own strategy. Whenever they hear the noise MaoMao makes chasing the light, they know the light has come, and they soon join the chase. I watch them play every time I have lunch; it is my most relaxing way to recover from a busy morning.

You may wonder, is this a technical blog about causality or a cat story? Although research suggests that watching cats play or do silly things can reduce stress, that’s not the main purpose of this article. The story is just my way of showing why causality matters in a more approachable way. Moreover, I am excited to introduce my first ‘Read with Me’ series, where I invite you to read a book together with me. I hope to provide a platform for us to delve deeper into our common interests and share what we learn. The book I’d like to start with is ‘The Book of Why’ by Judea Pearl, which "revolutionized the understanding of causality."

I am deeply interested in causality, not only because I use it intensively at work but also because I believe it to be true science. You have probably heard ‘correlation is not causation’ a lot. However, the real question is, why chase causality when correlation can already do so much for us? That’s where the cat story comes in handy.


Correlation-based Model vs. Causal Structures

MaoMao wants to play with the light reflection, and believe it or not, he has a forecasting model in his mind: whenever I am sitting at the dining table, the light shows up. The model can be written as:

P(Light) = 1, if I sit at the dining table
P(Light) = 0, otherwise

So whenever he observes me sitting at the dining table, he is ready to chase some light. A lot of the time, he is right. If I calculate the accuracy of his forecasts, he is right 99% of the time whenever it’s sunny. The 1% inaccuracy may come from the occasions when I am not wearing my watch. My other cats, who follow MaoMao’s behavior, are represented by these equations:

P(Light) = 1, if MaoMao is chasing it
P(Light) = 0, otherwise

They would also have similar accuracy. However, they are slower and always miss out on a head start in the light fight.
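To make the two cat models concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 50% chance that I am at the table, and the 99% chance that I am wearing my watch are invented, not measured. It only simulates sunny noons, which is the regime the cats learned from.

```python
import random

def simulate_sunny_noons(n, p_sitting=0.5, p_watch=0.99, seed=0):
    """Return (sitting, maomao_chasing, light) tuples for n sunny noons.

    p_sitting and p_watch are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        sitting = rng.random() < p_sitting
        watch = rng.random() < p_watch
        light = sitting and watch   # it is sunny, so only these two matter
        maomao_chasing = light      # MaoMao chases as soon as the light shows up
        records.append((sitting, maomao_chasing, light))
    return records

def maomao_model(sitting):
    """P(Light) = 1 if I sit at the dining table, 0 otherwise."""
    return 1.0 if sitting else 0.0

def follower_model(maomao_chasing):
    """P(Light) = 1 if MaoMao is chasing it, 0 otherwise."""
    return 1.0 if maomao_chasing else 0.0

def accuracy(predictions, outcomes):
    """Share of cases where the 0/1 forecast matches what happened."""
    hits = sum(int(p >= 0.5) == int(o) for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

records = simulate_sunny_noons(10_000)

# MaoMao is scored on the noons when I actually sit at the table.
seated = [r for r in records if r[0]]
maomao_acc = accuracy([maomao_model(s) for s, _, _ in seated],
                      [light for _, _, light in seated])

# The followers are scored on every noon.
follower_acc = accuracy([follower_model(m) for _, m, _ in records],
                        [light for _, _, light in records])

print(f"MaoMao, when I sit at the table: {maomao_acc:.3f}")
print(f"Followers, all noons:            {follower_acc:.3f}")
```

In this sketch the followers are never wrong, only late: their input is MaoMao's reaction to a light that is already there, and the simulation does not model that lag.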

These two cat forecasting models more or less sum up all correlation-based forecasting models. Such models learn patterns from historical data and find the variables that best predict the target variable. My cats’ two models are much simpler than the machine learning and deep learning models developed by smart human beings. However, they follow the same principle, just at a different level of complexity.

The system works, a lot of the time! My cats always get their daily exercise around my lunchtime as long as I am wearing a watch and sitting in the right location in the right weather. However, on cloudy days, at dinner time when the sun is down, or when I am not wearing a watch, their forecasting models’ accuracy drops to 0. Why? Because they haven’t solved the problem the right way, by learning the causal structure:

image by author

What they have in mind instead, their mental model, is:

image by author

Even though we humans can easily tell that this structure is wrong, my cats stick with their forecasting models. MaoMao sits with me whenever I have a meal, even at night. That’s the problem with forecasting without knowing the causality. When everything stays the same and there is no regime change, the correlation-based model works so well that people doubt the need to figure out a causal structure. Then things like COVID, geopolitical frictions, and business changes happen, and suddenly the model’s performance plummets and you are left scratching your head about what went wrong. In such cases, the only choice you have is to retrain and rebuild your models for the new regime, usually with limited training data available. The other scenario is worse: you resemble my other cats, constantly making lagged predictions. Then you are only reacting instead of being proactive, which defeats the purpose of a forecasting model, and you are always late.


How can causality help?

If MaoMao understood the causal relationship, that the sun and my watch together cause the light to appear on the wall, he would know exactly where to look for the light on a sunny day. Additionally, he would know there is no point in waiting for the light on a rainy day or when I am not wearing a watch. By the same principle, if I sit on the sofa in the sunlight, the light will also appear on the opposite wall; if I don’t wear a watch but have a phone in my hand, the phone will also reflect light onto the wall. Moreover, once my cats figure the structure out, they can try to influence my behavior through their cuteness and lead me to sit in places with sunlight. They can unleash the full potential of causality to their benefit, which is to play with the light. Isn’t it nice to be a cat who understands causality?
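Here is a small Python sketch of that difference. It is only an illustration under my own assumptions: the `Scene` fields, the scene names, and the rule "light needs direct sun plus something reflective" are a simplified stand-in for the real physics. MaoMao's rule only looks at where I sit, while the causal rule looks at what actually produces the light.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Scene:
    at_dining_table: bool
    sun_on_me: bool          # direct sunlight reaches the spot where I sit
    reflective_object: bool  # a watch, a phone, ...

def light_on_wall(scene):
    """Assumed causal mechanism: sunlight AND something to reflect it."""
    return scene.sun_on_me and scene.reflective_object

def maomao_expects_light(scene):
    """MaoMao's correlation-based rule: I sit at the table, so light appears."""
    return scene.at_dining_table

scenes = {
    "sunny lunch, watch on":     Scene(True, True, True),
    "cloudy lunch":              Scene(True, False, True),
    "dinner after sunset":       Scene(True, False, True),
    "sunny lunch, no watch":     Scene(True, True, False),
    "sunny sofa, phone in hand": Scene(False, True, True),
}

for name, scene in scenes.items():
    truth = light_on_wall(scene)
    guess = maomao_expects_light(scene)
    print(f"{name:27s} light={truth!s:5s} MaoMao expects={guess!s:5s} "
          f"correct={truth == guess}")

# A cat who knows the mechanism can also act on it: nudge me from a shady
# seat into the sun and the light follows.
shady_seat = Scene(at_dining_table=False, sun_on_me=False, reflective_object=True)
after_nudge = replace(shady_seat, sun_on_me=True)
print("shady seat:", light_on_wall(shady_seat), "after nudge:", light_on_wall(after_nudge))
```

MaoMao's rule is right only in the first scene, the one regime it was learned in. The causal rule needs no retraining for the others because it never depended on the dining table in the first place.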

Of course, our forecasting models are much more complicated than the cats’ models. We need to build much more complicated causal structures involving numerous features. However, the payoff is also much more than a play date. It’s the million-dollar question that every company wants answered: If I do X, will my sales go up? Will my customers churn less? Will my profit be higher?

That’s the question Judea Pearl wants to help us answer in "The Book of Why." In the introduction, he explains the ladder of causality.

Refer to Judea Pearl's "The Ladder of Causality"

The first rung makes predictions based on passive observations. It answers questions like "If I see my customer buy toothpaste, how likely am I to see them buy dental floss?"

The second rung is based on intervention, which means going beyond seeing and changing what is. A typical question at this rung is, "What will happen to sales of dental floss if we double the price of toothpaste?" or, more directly, "What price should we set to sell the most dental floss?"

The third rung is about counterfactuals. It sits at the top of the ladder because it involves imagination, an ability that emerged during the Cognitive Revolution and distinguishes human beings from animals. A typical question here is, "What would have happened to the dental floss sales if I hadn’t increased the price of toothpaste?"
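To see how the three rungs can give three different answers, here is a toy Python sketch of the toothpaste and dental floss example. None of it comes from the book: the structural equation in `floss_sales`, the "campaign week" confounder, the pricing rule, and all the numbers are assumptions I made up so the example stays small.

```python
import random
from statistics import mean

def floss_sales(toothpaste_price, campaign_week, noise):
    """Assumed structural equation: floss sales fall with toothpaste price
    and rise during a dental-health campaign week."""
    return 100 - 10 * toothpaste_price + 30 * int(campaign_week) + noise

def simulate_weeks(n, seed=0, do_price=None):
    """Simulate n weeks. With do_price set, the store's usual pricing rule
    is overridden (an intervention); everything else is left untouched."""
    rng = random.Random(seed)
    weeks = []
    for _ in range(n):
        campaign_week = rng.random() < 0.5     # confounder: drives price AND sales
        noise = rng.gauss(0, 2)
        if do_price is None:
            price = 3 if campaign_week else 2  # the store charges more in campaign weeks
        else:
            price = do_price
        weeks.append({
            "campaign_week": campaign_week,
            "price": price,
            "floss": floss_sales(price, campaign_week, noise),
        })
    return weeks

def avg_floss(weeks):
    return mean(w["floss"] for w in weeks)

# Rung 1, seeing: compare weeks where we happened to observe each price.
observed = simulate_weeks(10_000)
seen_at_3 = avg_floss([w for w in observed if w["price"] == 3])
seen_at_2 = avg_floss([w for w in observed if w["price"] == 2])

# Rung 2, doing: force the price in every week, campaign or not.
do_at_3 = avg_floss(simulate_weeks(10_000, do_price=3))
do_at_2 = avg_floss(simulate_weeks(10_000, do_price=2))

# Rung 3, imagining: take one week that really happened, recover its noise
# from the structural equation, then replay that same week at a lower price.
week = observed[0]
noise = week["floss"] - floss_sales(week["price"], week["campaign_week"], 0)
imagined = floss_sales(week["price"] - 1, week["campaign_week"], noise)

print(f"seeing:    price 3 -> {seen_at_3:.1f}, price 2 -> {seen_at_2:.1f}")
print(f"doing:     price 3 -> {do_at_3:.1f}, price 2 -> {do_at_2:.1f}")
print(f"imagining: actual {week['floss']:.1f}, with price one lower {imagined:.1f}")
```

In this made-up world, the observational comparison suggests that a higher toothpaste price goes with higher floss sales, because high prices only show up in campaign weeks. Forcing the price shows the opposite: each extra unit of price costs 10 units of floss sales, which is exactly the coefficient in the assumed equation. The counterfactual step reuses the noise of one specific week, so it answers the question for that week, not for the average week.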


Human-like Intelligence or Animal-Like Abilities

Photo by Phillip Glickman on Unsplash

From walking upright to harnessing the power of fire, from Pearson correlation to complex deep neural networks to LLMs, humans never stop innovating. We have made a lot of progress in building predictive AI models that resemble human intelligence, and these models equip machines with impressive abilities. However, human-like intelligence will not be achieved without embedding causality. We can create more complex models with more hidden layers and mathematical equations, but the fundamental principle remains the same as in my cats’ simple models: a correlation-based, first-rung solution to complicated questions.

With that, I will kick off the Causality Book Club and invite you to read "The Book of Why" with me. I plan to read one chapter a week and publish an article every two weeks covering two chapters. Each article may be my notes from reading the chapters or something useful related to their content. I highly encourage you to read the book if time permits. If not, stay tuned for my summaries and learnings. There are ten chapters in this book. At this pace, we can probably finish it by the end of this year. Here are the articles I have written so far:

Read with me by subscribing to my email list. Feel free to share what you learn in the comments below. Also, it’s my first time trying a Read with Me series, so let me know if you have any suggestions. I also highly encourage you to start your own blog to write down your learnings or thoughts throughout the process. As I said in my latest YouTube video about how I benefit from writing on Medium, writing has given me a more fruitful journey than I had anticipated. This could be your opportunity to try it out, too.


Thanks for reading. I hope my cat story inspires you to learn more about causality and join the journey with me.

Reference

The Book of Why by Judea Pearl

The Ladder of Causality Photos:

[1] Robot photo by Rock’n Roll Monkey on Unsplash

[2] Cat photo by Raoul Droog on Unsplash

[3] Intervention photo by British Library on Unsplash

[4] Human playing chess photo by JESHOOTS.COM on Unsplash

