Looky, looky yonder Where the sun done gone
Moby – The Last Day
When I was in high school, I clearly remember asking my history teacher, "What if the Roman Empire hadn’t fallen? How advanced would our technology be today?" She didn’t particularly appreciate my question. In fact, historians often express reservations about "what if" questions, sometimes referred to as counterfactual history. They prefer to interpret and explain events as they occurred, not as they might have happened. Their work is grounded in facts, sources, and evidence, and "what if" scenarios can potentially lead to conjecture or speculation, detracting from the rigorous analysis of historical realities.
As an introspective daydreamer in my teenage years, I kept wondering what might have happened had we not experienced the Middle Ages. Would positivist science have developed earlier? Would wars have occurred as often throughout the centuries? Would we have taken better care of our planet?
Such questions remain open because once an event occurs, it’s impossible to observe an alternate reality in which it didn’t. This is the fundamental problem of causal inference, the study of cause and effect. For instance, if the government prohibited the consumption of alcohol, would deaths from car accidents decrease? Ideally, we would answer this causal question by comparing car accident death rates in two worlds: our actual world, without the prohibition, and a parallel world that differs only in that the policy is in place. In this ideal scenario, the effect of the policy would be the death rate under the policy minus the death rate without it. Obviously, this isn’t feasible, as we only have access to our own reality.
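In the potential-outcomes notation commonly used in causal inference, the same idea can be written compactly. The symbols are introduced here only for illustration: Y_i(1) is the outcome of unit i with the policy, Y_i(0) is its outcome without the policy, and D_i indicates whether the policy was actually adopted.

```latex
\tau_i = Y_i(1) - Y_i(0),
\qquad
Y_i^{\text{obs}} = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0),
\qquad D_i \in \{0, 1\}
```

Because D_i is either 0 or 1, only one of the two potential outcomes is ever observed for a given unit, so the effect \tau_i cannot be computed directly. The missing term is the counterfactual.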
Do incentives to mayors improve education?
I have always been interested in the dynamics of education and public policy, particularly how they can interplay to shape a society’s future. When it came time to choose a topic for my master’s thesis, I was keen on exploring something relevant, impactful, and grounded in real-world implications. I wanted to delve into a topic that could potentially provide insights into improving the education system in Brazil not just in theory but in practice as well. It was during this quest that I came across two intriguing educational policies implemented in the Brazilian state of Ceará.
The first policy was a tax incentive (TI) for mayors to improve municipal education. It was an innovative approach that tied municipal tax transfers to educational achievement, encouraging local governments to invest more in their education systems. The second policy was a program offering educational technical assistance (TA) to municipalities, providing them with the necessary resources to improve their educational practices.
Some descriptive plots suggested that Ceará was improving more than other states even though it spent less on education, as the plot below shows. The y-axis shows the improvement in students’ scores on mathematics and Portuguese tests, while the x-axis shows average spending on education.
To be sure the policies actually caused these improvements, I had to analyze them in more depth, and once again I came across the fundamental problem of causal inference: What if Ceará hadn’t adopted these policies? Would its educational indicators be worse? In other words, did these policies have a positive effect on educational achievement? I didn’t have a perfect counterfactual, an alternate Ceará where the policies had not been adopted. Fortunately, causal inference provides some methods to approximate counterfactuals. One of them is the Synthetic Control method.
The Synthetic Control Method
The synthetic control method is a statistical technique used primarily in evaluating the effects of policy changes or other interventions when a control group isn’t available. The principle is based on the creation of a synthetic version of the unit of interest (in this case, Ceará) by combining multiple states that didn’t undergo the policy change. This "synthetic control" serves as the counterfactual – it’s what we might expect to have happened in the unit of interest had the policy not been implemented.
To construct this synthetic control, we must select a set of states not impacted by the policy – these are often referred to as donor units. The synthetic control is then created as a weighted combination of these donor units, chosen in such a way that the synthetic control closely matches the pre-intervention characteristics of the treated unit (Ceará). Essentially, the synthetic control represents a hypothetical Ceará that did not adopt the educational policies. This explanation simply outlines the fundamental idea behind the method. For a more comprehensive understanding, please refer to "Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects" by Alberto Abadie (2021)².
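For readers who like to see the idea in symbols, the weight selection can be sketched as the constrained minimization below, following the general setup in Abadie (2021). The indexing is chosen here for illustration: unit 1 is the treated unit, units 2 to J+1 are the donors, X_{hj} is the value of pre-intervention characteristic h for unit j, and v_h reflects the relative importance given to characteristic h.

```latex
W^{*} = \arg\min_{W} \; \sum_{h=1}^{k} v_h \Bigl( X_{h1} - \sum_{j=2}^{J+1} w_j X_{hj} \Bigr)^{2}
\quad \text{subject to} \quad
w_j \ge 0, \;\; \sum_{j=2}^{J+1} w_j = 1
```

The two constraints keep the synthetic unit a weighted average of real donor units, so it never extrapolates beyond what the donor pool actually looks like.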
Once the synthetic control is established, we compare the post-intervention outcomes of the treated unit (Ceará) and its synthetic counterpart. The difference between the outcomes of these two can be interpreted as the effect of the intervention or policy.
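The fit-then-compare logic can be sketched in a few lines of base R. This is a deliberately simplified toy, not the analysis from the thesis: the numbers are invented, there are only two hypothetical donor units, and the only characteristic being matched is the pre-intervention outcome itself. All variable names are assumptions made for this example.

```r
# Toy numbers invented for illustration only; NOT data from the thesis.
years        <- 2001:2006
policy_start <- 2005                         # hypothetical first policy year
treated      <- c(10, 11, 12, 13, 16, 18)    # outcome of the treated unit
donor_a      <- c( 9, 10, 11, 12, 13, 14)    # outcome of the first donor
donor_b      <- c(12, 13, 14, 15, 16, 17)    # outcome of the second donor
pre          <- years < policy_start         # TRUE for pre-policy years

# With two donors the weights are (w, 1 - w) with 0 <= w <= 1, so they are
# non-negative and sum to one by construction.
synthetic_path <- function(w) w * donor_a + (1 - w) * donor_b

# Pre-intervention fit: sum of squared gaps between treated and synthetic.
pre_fit_loss <- function(w) sum((treated[pre] - synthetic_path(w)[pre])^2)

# One-dimensional search for the weight with the best pre-policy fit.
w_a <- optimize(pre_fit_loss, interval = c(0, 1))$minimum

synthetic <- synthetic_path(w_a)
gap       <- treated - synthetic             # estimated effect in each year

result <- data.frame(
  year      = years,
  treated   = treated,
  synthetic = round(synthetic, 2),
  gap       = round(gap, 2)
)
print(result)
```

In this toy setup the pre-policy gaps are close to zero by construction, so the gaps that open up from the policy year onward are what the method would read as the effect. With real data there are many donors and several characteristics to match, which is the problem the Synth package solves.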
In the graphs below, I depict the trend of mathematics and Portuguese scores in both Ceará and the synthetic Ceará, which is unaffected by the policies. Note that the synthetic and actual trends closely align before the policy implementation but diverge significantly thereafter. According to this method, in the absence of the policies, Ceará’s scores would have followed the trajectory represented by the yellow line. The actual scores of Ceará, under the influence of the policies, are represented by the green line. The gap between these two lines indicates a positive effect of these policies.
A combination of both policies led to a consistent 12 percent increase in Portuguese test scores in primary education and a 6.5 percent increase in lower secondary education. The results suggested that well-designed policies could make a substantial difference in educational outcomes. The findings in mathematics were not statistically significant. In my published thesis¹ I provide some explanations for why this happened.
However, my analysis also revealed an area of concern. Despite these advancements in primary and lower secondary education, upper secondary schools, which were not directly affected by the new policies but received better-prepared students from lower levels, showed no significant improvement. This finding highlighted a critical gap in policy implementation and sparked a need for further debate on extending the benefits of educational policies to upper secondary schools, as well as to other Brazilian states.
The Synthetic Control Method in R
I used the R package Synth to implement the synthetic control. It is a powerful tool for estimating synthetic controls in R and offers two main functions:
dataprep(): prepares the donor pool and treated unit characteristics, as well as their outcomes of interest, as matrices. These matrices can then be passed to synth();
synth(): optimizes the set of weights to form the synthetic unit.
The package also offers functions to plot your results in base R, but you can also prepare the data delivered by synth() to be plotted in ggplot2, as I did above. Check the code here: https://github.com/bruno-ponne/Better-Incentives-Better-Marks
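To give a feel for the workflow, here is a minimal sketch using the basque example dataset that ships with Synth. It is not the Ceará analysis: the predictor choices are a simplified version of the example in the package documentation, and names such as plot_years and plot_data are assumptions made for this illustration. The actual code behind the thesis is in the repository linked above.

```r
library(Synth)
library(ggplot2)

data(basque)                 # example panel bundled with the Synth package
plot_years <- 1955:1997      # years to plot (assumption for this sketch)

# Step 1: arrange treated-unit and donor-pool data into matrices.
dataprep.out <- dataprep(
  foo                   = basque,
  predictors            = c("school.illit", "school.prim", "school.med",
                            "school.high", "school.post.high", "invest"),
  predictors.op         = "mean",
  time.predictors.prior = 1964:1969,
  special.predictors    = list(list("gdpcap", 1960:1969, "mean")),
  dependent             = "gdpcap",
  unit.variable         = "regionno",
  unit.names.variable   = "regionname",
  time.variable         = "year",
  treatment.identifier  = 17,            # treated region in the example data
  controls.identifier   = c(2:16, 18),   # donor pool
  time.optimize.ssr     = 1960:1969,     # window used to fit the weights
  time.plot             = plot_years
)

# Step 2: optimize the donor weights.
synth.out <- synth(data.prep.obj = dataprep.out)
weights   <- synth.out$solution.w        # one weight per donor unit

# Step 3: build the synthetic outcome path and a tidy data frame for ggplot2.
plot_data <- data.frame(
  year      = plot_years,
  treated   = as.numeric(dataprep.out$Y1plot),
  synthetic = as.numeric(dataprep.out$Y0plot %*% weights)
)

ggplot(plot_data, aes(x = year)) +
  geom_line(aes(y = treated, colour = "Actual")) +
  geom_line(aes(y = synthetic, colour = "Synthetic")) +
  geom_vline(xintercept = 1970, linetype = "dashed") +  # first year after the fitting window
  labs(x = "Year", y = "Outcome", colour = NULL)
```

The key step for custom plots is the matrix product of Y0plot and solution.w, which gives the synthetic unit’s outcome for every plotted year. Subtracting it from Y1plot gives the year-by-year gap.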
Final thoughts
Synthetic control gave me a unique opportunity to investigate the causal impact of these policies on Ceará’s educational achievement, offering a quantitative dimension to the question of "what if". With this approach, my research went beyond the realm of theoretical speculations, enabling a rigorous analysis based on data and statistical methods.
I have always believed that education is a key factor in fostering tolerance, opportunity, and democracy in developing countries. My journey using synthetic control has revealed the potential of well-designed policies to substantially improve educational outcomes. It is my hope that these findings offer policy-makers valuable insights to make informed choices for our educational future.
Articles cited:
¹Ponne, B. G. (2023). Better Incentives, Better Marks: A Synthetic Control Evaluation of the Educational Policies in Ceará, Brazil. Brazilian Political Science Review, 17(1), e0005. https://doi.org/10.1590/1981-3821202300010005
²Abadie, A. (2021). Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects. Journal of Economic Literature, 59(2), 391–425.