
Background
India is a melting pot of cultures, traditions, and values. The country of more than a billion people is organized into 36 States and Union Territories. The quality of education is affected not only by the presence of different educational boards but also by prevailing socio-economic factors.
I grew up in a beautiful Christian brothers’ school in a quiet town. After completing school, I stayed in a couple of different places in India, which exposed me to diverse backgrounds and diverse schooling practices.
Just as any problem is solved most effectively by working at the grassroots level, I believe that a nation can develop sustainably and steadily by improving the schools in which the leaders of tomorrow are studying.
Introduction
In this project I have compared the following factors among states to understand prevalent gaps in the Indian education system:
- Dropout rates between girls and boys
- Dropout rates in different sections of the school
- Enrollment rates between girls and boys
- Enrollment rates in different sections of the school
- Student to teacher ratio
- Presence of different factors like electricity, toilets for boys and girls, etc.
The sections of the school are defined as follows:
- Primary school: classes I-V
- Upper primary: classes VI-VIII
- Secondary: classes IX-X
- Higher secondary: classes XI-XII
Analysis
Dropout and enrollment rates

Observations:
- Enrollment rates fall as we move to higher sections of the school, dropping drastically from above 100% in primary school to about 60% in higher secondary. (A gross enrollment ratio can exceed 100% because it also counts over-age and under-age students.)
- Lakshadweep and Jammu & Kashmir, two union territories, have exceptionally low enrollment compared to other states in earlier schooling sections.
- The highest dropout rates are in secondary school rather than higher secondary, because enrollment in higher secondary is already low.
About 15% of the roughly 85% of students who enroll in secondary school drop out, so only about 72% of students are retained through class X. For those who drop out, that is where their formal education ends.
However, a single point estimate is misleading here. It is better to read this as a range: retention mostly falls between 60% and 83%, and it varies a lot depending on the state.
- We also need to check if sex is a factor in enrollments and dropouts.
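The retention arithmetic in the observations above can be reproduced with a minimal sketch. The function name `retained_share` is hypothetical, and the inputs are the rounded figures quoted in the text rather than values read from the dataset.

```python
def retained_share(enrollment_rate: float, dropout_rate: float) -> float:
    """Share of the age cohort that enrolls and does not drop out."""
    return enrollment_rate * (1.0 - dropout_rate)


# Rounded figures quoted in the text: ~85% enroll, ~15% of those drop out.
point_estimate = retained_share(0.85, 0.15)
print(f"Retained through class X: {point_estimate:.0%}")
```

Running the same function on each state's own enrollment and dropout rates is one way to arrive at a range rather than a single point estimate.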

Observations:
- Boys have lower enrollment rates and higher dropout rates than girls
In rural India, parents often send their boys to work as soon as they are old enough to earn and bring some money home. This may be the reason for the slightly higher dropout rate among boys. On the other hand, some parents may be reluctant to educate girls and marry them off instead. It is important to note that the upper end of the range of dropout rates is higher for girls than for boys. So even though boys have a higher dropout rate overall, girls' dropout rates are affected by outliers.
Student to teacher ratio in different sections of the school
Observations:
- The highest student-to-teacher ratio is in higher secondary school.
- There is also a lack of teachers in secondary school. This dearth of qualified teachers may be due to schools being unable to attract qualified candidates with low salaries.
- States like Uttar Pradesh (UP), Bihar, Jharkhand, and West Bengal (WB) are doing poorly in this aspect.
Students receive less attention in secondary school, and the situation only worsens in higher secondary school. This is likely an important factor behind the higher dropout rates in secondary and higher secondary school.
Overall picture
Different factors in schooling
Observations:
- More than 50% of states have at least 92% of schools with drinking water
- Many more schools have drinking water than electricity
- A very low percentage of schools have computer facilities
- A higher percentage of schools have girls' toilets than boys' toilets
Comparison of different factors among sections of the school:
The mean % of features present in different sections of school:

Observations:
There is a sharp reduction in the availability of toilets from upper primary to secondary school, but not so much in the availability of drinking water. The schools may not have the infrastructure to set up hygienic toilets, and unavailability of water in the toilets may be one reason.
Surprisingly, secondary schools also have the lowest availability of electricity. Note too that the percentage of schools with computers drops between upper primary and higher secondary school.
Most schools in rural areas have the facilities to teach only up to secondary school (class X). In these upper classes, there is a dearth of good facilities at school. The situation is aggravated by students dropping out to pursue work. As a result, there isn't enough demand to improve the conditions in the secondary section of these schools.
Hypothesis testing
So far we have done an exploratory analysis of the data. Hypothesis testing lets us check whether the differences observed above are statistically significant.
Dropout
Sex



We do a t-test to obtain:
t-statistic=-0.981, p=0.330
Therefore we fail to reject H0 (no difference in mean dropout rates between girls and boys).
There is no statistically significant difference in dropout rates between girls and boys.
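As a rough illustration of the t-test above, here is a minimal sketch using `scipy.stats.ttest_ind`. The source does not say which t-test variant was used, so the independent two-sample (Welch) form, the 0.05 significance level, and the randomly generated dropout rates are all assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Hypothetical state-level dropout rates (%) for illustration only;
# these are NOT the MHRD figures used in the analysis.
rng = np.random.default_rng(0)
girls_dropout = rng.normal(loc=12.0, scale=5.0, size=36)
boys_dropout = rng.normal(loc=13.0, scale=5.0, size=36)

# Welch's two-sample t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(girls_dropout, boys_dropout, equal_var=False)

alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t-statistic={t_stat:.3f}, p={p_value:.3f} -> {decision}")
```

If each state contributes one value for girls and one for boys, a paired test (`scipy.stats.ttest_rel`) would be a reasonable alternative.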
Sections of school

We do an ANOVA test to obtain:
F-Statistic=45.282, p<0.001
Therefore we reject H0.
There is a significant difference in dropout rates among different sections of the school.
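A one-way ANOVA of this kind can be sketched with `scipy.stats.f_oneway`. The group means, spreads, and sample sizes below are made-up values used only to show the call; they are not the values behind the reported F-statistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical dropout rates (%) per state for each section (made-up values).
dropout_by_section = {
    "primary": rng.normal(4.0, 2.0, size=36),
    "upper_primary": rng.normal(5.0, 2.0, size=36),
    "secondary": rng.normal(15.0, 4.0, size=36),
    "higher_secondary": rng.normal(3.0, 2.0, size=36),
}

# One-way ANOVA: H0 is that all section means are equal.
f_stat, p_value = stats.f_oneway(*dropout_by_section.values())
print(f"F-Statistic={f_stat:.3f}, p={p_value:.3g}")
```

A significant F-statistic only says that at least one section differs from the others; a post-hoc comparison would be needed to say which one.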
Enrollment
Sex



We do a t-test to obtain:
t-statistic=0.878, p=0.383
Therefore we fail to reject H0 (no difference in mean enrollment rates between girls and boys).
There is no statistically significant difference in enrollment rates between girls and boys.
Sections of school

We do an ANOVA test to obtain:
F-Statistic=58.335, p<0.001
Therefore we reject H0.
There is a significant difference in enrollment rates among different sections of the school.
Student to teacher ratio

We do an ANOVA test to obtain:
F-Statistic=12.850, p<0.001
Therefore we reject H0.
There is a significant difference in student-to-teacher ratio among different sections of the school.
Other factors

We do a χ2 test to assess independence between the section of the school and the presence of different factors.
The results of the χ2 test:
p = 0.0054 at 12 degrees of freedom.
Therefore we reject H0.
The presence of different factors depends on the section of the school.
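The χ2 test of independence above can be sketched with `scipy.stats.chi2_contingency`. The contingency table below is hypothetical: the counts are invented, and the 4 sections × 5 factors layout is an assumption, chosen because (4-1) × (5-1) = 12 matches the reported degrees of freedom.

```python
import numpy as np
from scipy import stats

sections = ["primary", "upper primary", "secondary", "higher secondary"]
factors = ["electricity", "drinking water", "computers",
           "boys' toilet", "girls' toilet"]

# Hypothetical counts of schools having each factor
# (rows: sections, columns: factors). Not the MHRD data.
observed = np.array([
    [520, 900, 110, 860, 880],
    [610, 930, 300, 900, 920],
    [450, 880, 350, 600, 640],
    [700, 940, 500, 780, 800],
])

# H0: the presence of a factor is independent of the section.
chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2={chi2_stat:.2f}, p={p_value:.4g}, degrees of freedom={dof}")
```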
Summary of hypothesis tests
- Dropout and enrollment rates do not differ significantly between boys and girls.
- Dropout and enrollment rates differ significantly among sections of the school.
- Student-to-teacher ratio differs significantly among sections of the school.
- The presence of different factors depends on the section of the school.
Visualizing performance of states
We can only visualize data in up to three dimensions. I used principal component analysis (PCA) to find the components that explain the most variance. This way we can reduce the number of dimensions while still incorporating information from the higher-dimensional data.
Test statistics for proceeding with PCA:
- KMO statistic: 0.61
- Bartlett's test for sphericity: p ≈ 0
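Both suitability checks can be computed from the correlation matrix. The sketch below is an assumption about how such statistics are typically obtained, not the author's code: `bartlett_sphericity` and `kmo_statistic` are hypothetical helper names, and the feature matrix is randomly generated.

```python
import numpy as np
from scipy import stats


def bartlett_sphericity(data: np.ndarray) -> tuple:
    """Bartlett's test that the correlation matrix is an identity matrix."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)
    chi_sq = -(n - 1 - (2 * p + 5) / 6.0) * log_det
    dof = p * (p - 1) / 2
    return chi_sq, stats.chi2.sf(chi_sq, dof)


def kmo_statistic(data: np.ndarray) -> float:
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale  # partial correlations between feature pairs
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    q2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + q2)


# Hypothetical data: 36 "states" x 6 correlated features (made-up values).
rng = np.random.default_rng(2)
latent = rng.normal(size=(36, 2))
weights = rng.normal(size=(2, 6))
features = latent @ weights + 0.5 * rng.normal(size=(36, 6))

chi_sq, bartlett_p = bartlett_sphericity(features)
kmo = kmo_statistic(features)
print(f"KMO statistic: {kmo:.2f}, Bartlett p-value: {bartlett_p:.3g}")
```

A small Bartlett p-value indicates that the features are correlated enough for PCA to be useful, and a KMO value closer to 1 indicates better sampling adequacy.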
Let us find these broader features. The factor loadings for the three principal components:

Observations:
- PC1 captures the presence of different factors in school
- PC2 captures the student-to-teacher ratio in different sections of the school
- PC3 captures enrollments (and, indirectly, dropouts) in different sections of the school
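For readers who want to see how factor loadings like these are obtained, here is a minimal NumPy sketch of PCA on standardized features. The input matrix is randomly generated and the choice of six features is an assumption; only the number of retained components (three) comes from the text.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical inputs: 36 "states" x 6 features (made-up values).
features = rng.normal(size=(36, 6))
features[:, 1] += features[:, 0]  # induce some correlation between features

# Standardize so that every feature contributes on the same scale.
z = (features - features.mean(axis=0)) / features.std(axis=0, ddof=1)
corr = np.corrcoef(z, rowvar=False)

# Eigen-decomposition of the correlation matrix, largest eigenvalue first.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

n_components = 3
# Loadings: correlation between each original feature and each component.
loadings = eigenvectors[:, :n_components] * np.sqrt(eigenvalues[:n_components])
explained = eigenvalues[:n_components] / eigenvalues.sum()
# Scores: coordinates of each state in the 3D component space.
scores = z @ eigenvectors[:, :n_components]

print("Explained variance ratio:", np.round(explained, 3))
```

Features with large absolute loadings on a component are the ones used to name it, as in the observations above.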
These three components were used to create non-hierarchical clusters. The maximum silhouette score was obtained with 3 clusters (here 0=2).
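Choosing the number of clusters by silhouette score can be sketched as follows. The text only says the clustering was non-hierarchical, so the use of k-means (via scikit-learn) is an assumption, and the three-dimensional scores below are synthetic points rather than the actual component scores of the states.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(4)

# Hypothetical PC scores: three well-separated groups of 12 "states" each.
centers = np.array([[0.0, 0.0, 0.0], [6.0, 6.0, 0.0], [0.0, 6.0, 6.0]])
pc_scores = np.vstack(
    [center + rng.normal(scale=0.5, size=(12, 3)) for center in centers]
)

# Fit k-means for several values of k and record the silhouette score.
silhouette_by_k = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels = model.fit_predict(pc_scores)
    silhouette_by_k[k] = silhouette_score(pc_scores, labels)

best_k = max(silhouette_by_k, key=silhouette_by_k.get)
print("Number of clusters with the highest silhouette score:", best_k)
```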

Visualizing the clusters:

Interpretation
Cluster 0: The underdogs
Factors are largely absent, but the student-to-teacher ratio is low and the enrollment rate is high, both of which are commendable. This cluster of states has the highest potential, and we should first focus on improving the presence of factors in the schools in this cluster.
Cluster 1: The privileged
The presence of factors is high and the student-to-teacher ratio is the lowest. In spite of these conducive factors, this group does not have the highest enrollment ratio. This becomes clear when we look at the dropouts vs enrollments scatterplot: these states have much higher than average dropout rates. This may be due to socio-political, cultural, and other reasons not accounted for by the features of the school system.
Cluster 2: Absent teachers
In this group the student-to-teacher ratio is very high, which is reflected in the lowest enrollment ratio. We need to staff teachers in the schools in this cluster.
Conclusion
The major factors accounting for the differences in education sector performance among states:
- Dropouts in secondary
- Enrollments in higher secondary
- Presence of electricity and computers in the schools
Further research
The next big startup is expected to be an education company. I believe that, instead of marketing a new product, it will improve upon the existing infrastructure. According to the Pareto principle, roughly 80% of the benefits come from focusing on 20% of the issues. Here, that means the gaps in secondary school need to be covered: we need to direct efforts towards reducing dropouts in secondary school.
Call to action
- Staffing qualified teachers in states in the third cluster:
  Specific requirements:
  - The government has to reduce the pay gap between public and private schools to attract talent.
  - Leading NGOs in the education sector that staff underfunded schools should be allowed to transition their fellows and interns into permanent positions.
  - A lot of engineers are unemployed. They have learned a lot of maths and science and can be reskilled to become teachers.
  Measurable targets: India should reach a student-to-teacher ratio of 19 in secondary and 23 in higher secondary.
  Hypothetical timeframe: 3 years.
- Teachers are the single most important factor for ensuring learning outcomes. Teaching quality needs to be assessed.
  Specific requirements: New metrics have to be established following global standards.
  Measurable targets: Teach
  Hypothetical timeframe: 3 years
- Improving the school facilities by ensuring the availability of electricity in the first cluster.
  Specific requirements: Increase the % of schools having electricity
  Measurable targets:
  Hypothetical timeframe: 3 years
These tasks are no small feat and will require a lot of work at the grassroots level in towns and rural areas.
- Understand what causes high dropout rates in the second cluster by holding focus group discussions with teachers and parents from these states (predominantly north-eastern states).
Data
MHRD data: https://www.education.gov.in/sites/upload_files/mhrd/files/statistics-new/ESAG-2018.pdf
Other references:
Global benchmarks:
Pupil-teacher ratio, lower secondary