Author Spotlight

“The main driver behind my writing has always been learning”

Matteo Courthoud reflects on leaving academia, his interest in causal inference, and the value of public writing

TDS Editors
Towards Data Science
10 min read · Feb 14, 2023


In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Matteo Courthoud.

Photo courtesy of Matteo Courthoud

Matteo is a Ph.D. student in Economics at the University of Zürich with a passion for data analysis and causal inference. He was a Data Science intern at Google and did research consulting for Amazon while pursuing his Ph.D. He regularly writes about causality at the intersection of the social sciences and computer science, trying to bridge the gap between academic research and the general public.

How did your data science journey begin?

My interest in causal inference has always been there, but it only took a defined shape in recent years. I have always been a curious person, and I have always loved tinkering with numbers. Unfortunately, like most people of my generation in Italy, I received no high school education in statistics, programming, or anything that would nowadays fall under the data science umbrella. I only had a couple of programming classes in Pascal, which made one thing certain when selecting my university major: I didn’t want to write code ever again. On the other hand, I was torn between my interest in human interactions and decision-making and my love for numbers. So, although my two main options were sociology and math, I picked economics as a fair trade-off.

In retrospect, was it the right choice for you?

I was lucky. I had only a vague idea of what economics was about, and it ended up being exactly what I was looking for: a scientific approach to studying human interactions. Some people despise economics because it translates everything into monetary values, but I think that, paraphrasing Churchill, “money is the worst measure of human incentives, except for all those other forms that have been tried from time to time”. If we want to take a scientific approach to human interactions, we need to quantify incentives, and money serves exactly that purpose, explicitly or implicitly, whether we like it or not.

Despite that, when I studied economics, the curriculum did not include as much data science as I think it should have (and does now). In US terms, I was basically forced to take a law, accounting, management, or finance minor, with almost no option to invest more in computer science, informatics, or data science. As an undergrad, I was only taught Excel, and during my master’s I used Stata, but there were no courses in Python or R. Moreover, the econometrics classes consisted mostly of theory and algebraic proofs rather than programming and data analysis. Therefore, my first hands-on approaches to data science were spontaneous and self-taught. Luckily, it was the birth period of MOOCs, and open lectures were gradually becoming available, often for free. Initially, I was just scraping the internet for movie information or local hiking trails. Then I switched to more complex projects, like support bots for online card games, using Bayesian dynamic decision-making models.

How did you decide on your next steps?

The decision to go to grad school was a very tough one. After my master’s degree, I applied for grad school and received several offers, but I also got an offer for a position as a data scientist in a unicorn startup. I knew I loved numbers and I wanted to work with data, but I was also profoundly unsatisfied with my education. There were too many things I wanted to know and didn’t. In the end, curiosity prevailed and I went to grad school.

As a doctoral student in economics, I received rigorous and in-depth training in causal inference. I also attended my first-ever classes in programming, data science, and machine learning. However, I started my Ph.D. with a very precise research focus: antitrust and competition policy. Competition policy is the study of strategic interactions across firms and between firms and consumers. I was drawn to the topic for two main reasons. First, it seemed to me the branch of economics in which the agents’ rationality assumption makes the most sense: firms directly or indirectly try to maximize profits and are (arguably) more rational than single individuals in their decision-making. Second, I wanted to do impactful research with clear policy applications, and in recent years there has hardly been a more active policy field than antitrust.

However, over the years, it became clear that there was no data available for an academic to study the questions I was interested in. I could only study those questions from a theoretical perspective, or concentrate my attention where I could find the data, usually narrower and secondary markets. Of the two options, I found the first one more inspiring: answering relevant questions imperfectly rather than answering irrelevant ones precisely (quoting my advisor). However, the result was that my research was only theoretical, and I was not working with data as much as I wanted to. Unsurprisingly, the outputs were also mediocre: many people were supportive, but nobody was enthusiastic, myself included. I had come to a dead end.

How did you move on from that impasse?

During the winter of 2022, I made a decision that had been lurking in the background for a while: leaving academia. It was not an easy decision, since leaving academia is internally perceived as a failure, especially in economics and in Europe. There is little information on the available opportunities, and the risk of “coming out” is losing professors’ attention and being downgraded to a second-class student. However, my advisor was extremely supportive and fully backed my decision.

You started writing for a broader audience around the same time — how do you see the connection between these two developments?

Writing for Towards Data Science came as a natural consequence. First of all, I decided to apply for Ph.D. internships in economics and data science for the following summer. I had done very little work outside of academia, and I wanted to explore my options further before committing. To prepare for the interviews, I started reviewing topics from the early years of grad school, as well as new ones that I was interested in but that had never intersected with my research agenda. At first, I was just writing notes for myself, mixing code and text. At a certain point, I realized that I could make these notes public, and I went for the blog that had helped me the most in discovering new topics in data science.

I soon realized that technical writing was not only useful to me but also extremely satisfying. It was a great incentive to explore topics in depth and to clarify, organize, and verbalize my thoughts. Moreover, readers appreciated my effort to translate academic research into more accessible content, and this came as a surprise to me. I was mostly writing for myself, and I didn’t expect anything to come of it. I received public and private comments, mostly supportive and, when critical, constructive. It was the most satisfying moment of my Ph.D. experience. Academia, unfortunately, does not reward teaching, despite it being a significant share of a professor’s day-to-day job. There are also few incentives for reaching a broader audience, both in terms of outlets and language. The only thing that matters is writing scientific papers for academic journals. Writing for Towards Data Science was a breath of fresh air.

Having made this major change, what kinds of projects and questions do you find yourself most drawn to these days?

In terms of topics, my main interest is causal inference. I am interested in ways to exploit data to answer causal questions, whether through randomized experiments or observational studies. Within causal inference, my choice of topics is usually driven either by the will to master a topic I already know or by the urge to explore a new one. Often the two are correlated: I find an advanced topic that sounds fascinating, and, after I start reading about it, I soon realize that I need to review some basic notions to fully understand it. Then I start reviewing and writing about the basics, tackling the advanced topics only later. I have found this approach to work particularly well: it lets me be more precise on advanced topics without cluttering the articles, since I can refer back to my previous work.

The list of topics is ever-growing. I started with a couple of topics and, for every article that I wrote, I added two more to the pipeline. I think I now have a list of around 50 articles I would like to write. Moreover, new research is constantly being released. Twitter has historically been the social network I used to stay up to date with academic research. Most economists, statisticians, and computer scientists in academia publish and discuss their work on Twitter. On the other hand, I have recently come to appreciate LinkedIn as a way to stay up to date with the industry. Researchers in industry, practitioners, and data science influencers are mostly active on LinkedIn, which has been a great source of inspiration. One of my most interesting article-writing experiences, on CUPED, was born out of a LinkedIn post discussing this established industry technique, which improves the efficiency of estimators for A/B tests and which I had never heard of. It sounded similar to other methods I knew, so I started exploring the similarities and differences.
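
For readers unfamiliar with CUPED, the core idea is to subtract from the experiment metric the variation already explained by a pre-experiment covariate, which leaves the treatment-effect estimate unbiased but shrinks its variance. Here is a minimal sketch on simulated data (an illustration we add here, not Matteo’s implementation; all variable names are made up):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated A/B test: x is a pre-experiment metric, d a random assignment
n = 10_000
x = rng.normal(10, 2, n)                  # pre-experiment covariate
d = rng.binomial(1, 0.5, n)               # treatment assignment
y = x + 0.3 * d + rng.normal(0, 1, n)     # outcome, true effect = 0.3

# CUPED: remove from y the variation explained by the pre-experiment covariate
theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
y_cuped = y - theta * (x - x.mean())

# Compare the difference-in-means estimate and its standard error
for label, outcome in [("raw", y), ("CUPED", y_cuped)]:
    diff = outcome[d == 1].mean() - outcome[d == 0].mean()
    se = np.sqrt(outcome[d == 1].var(ddof=1) / (d == 1).sum()
                 + outcome[d == 0].var(ddof=1) / (d == 0).sum())
    print(f"{label:5s} estimate: {diff:.3f} (s.e. {se:.3f})")
```

Both estimates are unbiased for the true effect, but the CUPED-adjusted one has a noticeably smaller standard error, since the pre-experiment covariate explains most of the outcome’s variance.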

Are there any insights you can offer to others who might be interested in public writing about similar topics?

The main driver behind my writing has always been learning. And the most effective way to learn is to teach a curious audience. During the first year of grad school, I was lucky to study with a brilliant friend who did not have a background in economics. They always remind me that one day I told them: “I love studying with you because you know nothing about econometrics, so you ask the most interesting questions!” It was not very delicate, but it was very true. There is no better way to check whether you have truly learned something than trying to explain it in simple terms to a curious person.

From my undergrad to my Ph.D., I have always needed to rewrite, summarize, and visualize notions in order to learn them. Preparing for an exam, for me, always meant writing a comprehensive but short summary of everything I knew, with as many examples, metaphors, plots, and diagrams as needed. The fact that other people would use that material was not only rewarding but also an extra incentive to be clearer and more intuitive.

Once you have decided that you want to write, there is the question of how to write. My main advice, for writing as for any new activity, is: copy, smartly. Of course, that does not mean copy-paste. It means finding articles you would be proud of if you had written them, and analyzing them to understand what you like the most and in which dimensions they differ from your writing style. How do they open? What’s the narrative flow? How much detail do they go into? For a person who comes from academia, the hardest part is translating notions learned from technical papers into notions that are intuitive to anyone. It is a difficult translation process that we are not trained for. Quite the opposite: we are trained to remove anything superfluous and stick to the bare minimum to make the text machine-readable.

Are there any changes or developments you hope to see in your field over the next year or two?

I think that, in the future, causal inference will become more and more central, and we will see a convergence between the theoretical approach of the social sciences and the data-driven approach of computer science.

On the one hand, computer science has recognized that brute-force prediction is of little use for informing out-of-sample decisions (as all decisions are). Prediction algorithms are still important, since causality can be framed as a prediction problem for counterfactuals: to assess a policy’s impact, we want to predict what would have happened without that policy, and to test a new policy, we want to know what will happen with and without it.

In both cases, we want to predict something that has never happened or, in data science terms, something for which we do not have an appropriate training dataset. Marginal in-sample prediction improvements might bring more out-of-sample noise than signal. Data science recognized this problem a long time ago and has developed many methods to flag it, but not to solve it. Sample splitting was the first step in this direction (and is incredibly overlooked in the social sciences): it tells you when something does not work, but not why or how to fix it. The same goes for all the data-drift detection techniques: they are informative but not constructive. All the recent emphasis on data quality over quantity is another symptom of the growing interest in causality in computer science.
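
To make the sample-splitting point concrete, here is a minimal illustration (ours, assuming scikit-learn). A flexible model fit on pure noise looks excellent in-sample; the held-out split flags the failure but, as noted above, says nothing about why it fails or how to fix it:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Features that are pure noise: no model can genuinely predict y from X
X = rng.normal(size=(500, 20))
y = rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# The in-sample fit looks great; the held-out split exposes the problem,
# but tells us nothing about why the model fails or how to fix it
print("in-sample R^2:    ", r2_score(y_train, model.predict(X_train)))
print("out-of-sample R^2:", r2_score(y_test, model.predict(X_test)))
```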

On the other hand, the social sciences are slowly internalizing the fact that there is a lot of data out there, and that this data can be useful for answering causal questions. In particular, whenever we can frame a causal question as a prediction problem, there is an opportunity to borrow from the machine learning literature. The main applications have been doubly-robust machine learning for covariate selection, generalized random forests for heterogeneous treatment effects, matrix completion methods for potential outcomes, and reinforcement learning for dynamic decision-making. The main constraint is that most of the (interesting) data is private and not available for research. Besides significantly hampering scientific inquiry in some fields (see competition policy above), this has also lowered the incentives to develop more data-hungry causal inference methods.
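
As one concrete example of this borrowing, here is a compact sketch (ours, not Matteo’s) of a doubly-robust (AIPW) estimate of an average treatment effect, with machine learning models plugged in for the nuisance functions. It is illustrative only: a full double/debiased ML implementation would add cross-fitting, and all names below are made up:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Simulated observational data: treatment probability depends on x
n = 5_000
x = rng.normal(size=(n, 3))
p = 1 / (1 + np.exp(-x[:, 0]))              # true propensity score
d = rng.binomial(1, p)                      # non-random treatment
y = x[:, 0] + 2.0 * d + rng.normal(size=n)  # outcome, true ATE = 2.0

# Nuisance models: propensity score e(x) and outcome regressions mu0, mu1
e_hat = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]
mu0 = GradientBoostingRegressor().fit(x[d == 0], y[d == 0]).predict(x)
mu1 = GradientBoostingRegressor().fit(x[d == 1], y[d == 1]).predict(x)

# AIPW estimator: consistent if either the propensity model
# or the outcome model is correctly specified
ate = np.mean(
    mu1 - mu0
    + d * (y - mu1) / e_hat
    - (1 - d) * (y - mu0) / (1 - e_hat)
)
print(f"doubly-robust ATE estimate: {ate:.2f}")  # close to the true 2.0
```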

Lastly, the role of causal inference in the industry is now established and will only grow. New tech companies have experimentation and A/B testing at the core of their decision-making, while older companies either adapt or lag behind. I see two main trends emerging. First, expanding experimental causal inference: network and spillover effects can bias simple difference-in-means estimators, and even when randomized experiments provide unbiased treatment-effect estimates, additional data and/or models can help increase the precision of those estimates. Second, quasi-experimental methods will become more prominent, since not everything can be randomized, for practical, ethical, or cost reasons. I am very excited to witness the evolution of these trends, and I hope to see a convergence in causal inference methods across disciplines.

To learn more about Matteo’s work and stay up to date with his latest articles, follow him here on Medium, on LinkedIn, and on Twitter. For a taste of Matteo’s TDS articles, here are several standouts from our archive:

Feeling inspired to share some of your own writing with a wide audience? We’d love to hear from you.

This Q&A was lightly edited for length and clarity.
