Analysis of the Indian education system

Comparison among different states on several factors in the education space

Source

Background

India is a melting pot of cultures, traditions, and values. The subcontinent of 1+ billion people houses its citizens in 36 States and Union Territories. The quality of education is affected not only by the presence of different educational boards but also by prevalent socio-economic factors.

I grew up in a beautiful Christian brothers’ school in a quiet town. After completing school, I stayed in a couple of different places in India, which exposed me to diverse backgrounds and diverse schooling practices.

Just as any problem is solved most effectively by working at the grassroots level, I believe that a nation can develop sustainably and steadily by improving the schools in which the leaders of tomorrow are studying.

Introduction

In this project I have compared the following factors among states to understand prevalent gaps in the Indian education system:

  • Dropout rates between girls and boys
  • Dropout rates in different sections of the school
  • Enrollment rates between girls and boys
  • Enrollment rates in different sections of the school
  • Student to teacher ratio
  • Presence of different factors like electricity, toilet for boys and girls, etc.

Sections of the school by class: primary (I-V), upper primary (VI-VIII), secondary (IX-X), higher secondary (XI-XII).

Analysis

Dropout and enrollment rates

Image by author

Observations:

  • Enrollment rates fall in higher sections of the school, dropping drastically from above 100% in primary school to 60% in higher secondary. Ratios above 100% are possible when enrollment is measured as a gross ratio that counts over-age and under-age students.
  • Lakshadweep and Jammu & Kashmir, two union territories, have exceptionally low enrollment compared to other states in the earlier schooling sections.
  • Dropout rates peak in secondary school rather than in higher secondary, where enrollment is already low.

Of the roughly 85% of students who enroll in secondary school, about 15% drop out. As a result, only about 72% of students (0.85 × 0.85) are retained through class X. For many of them, class X becomes their highest educational qualification.

However, a single point estimate is misleading. The figure varies a lot depending on the state and mostly ranges between 60% and 83%.
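The arithmetic behind the 72% figure can be sketched in a few lines. The national inputs are the approximate figures quoted above; the per-state inputs and the helper name `retained_share` are hypothetical and only show how the same calculation could be repeated per state to get a range rather than a point estimate.

```python
def retained_share(enrollment_rate: float, dropout_rate: float) -> float:
    """Share of the age group still in school after dropouts are removed."""
    return enrollment_rate * (1.0 - dropout_rate)


# Approximate all-India figures quoted in the text.
national = retained_share(enrollment_rate=0.85, dropout_rate=0.15)
print(f"National retention through class X: {national:.1%}")

# Hypothetical state-level inputs (illustrative only, not the article's data).
states = {
    "State A": (0.90, 0.08),
    "State B": (0.80, 0.25),
}
for name, (enrolled, dropped) in states.items():
    print(f"{name}: {retained_share(enrolled, dropped):.1%}")
```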

  • We also need to check if sex is a factor in enrollments and dropouts.
Image by author

Observations:

  • Boys have lower enrollments and higher dropouts

In rural India, parents often send their boys to work as soon as they are old enough to earn and bring some money home. This may be the reason for slightly higher dropouts among boys. Girls face a different pressure: some parents may be reluctant to educate them and marry them off instead. Note that the upper range of dropout rates is higher for girls than for boys. So even though boys have a higher dropout rate overall, girls’ dropout rates reach more extreme values in some states.

Student to teacher ratio in different sections of the school

Observations:

  • The highest student: teacher ratio is in higher secondary school.
  • There is also a lack of teachers in secondary school. This dearth of qualified teachers may be due to schools being unable to attract qualified candidates with a meager salary.
  • States like UP, Bihar, Jharkhand, and WB are doing poorly in this aspect.

Students receive less attention in secondary school. The situation only worsens in higher secondary school. These are important factors leading to higher dropouts in secondary and higher secondary school.

Overall picture

Different factors in schooling

Observations:

  • More than 50% of states have at least 92% of schools with drinking water
  • Many more schools have drinking water than electricity
  • A very low percentage of schools have computer facilities
  • A higher percentage of schools have girls’ toilets than boys’ toilets

Comparison of different factors among sections of the school:

The mean % of features present in different sections of school:

Observations:

There is a sharp reduction in the availability of toilets from upper primary to secondary school, but not so much in the availability of drinking water. Maybe the schools do not have the infrastructure to set up hygienic toilets; unavailability of water in the toilets may be one reason.

Surprisingly, secondary schools also have the lowest availability of electricity. Note, too, that the percentage of schools with computers drops between upper primary and higher secondary school.

Most schools in rural areas have the facilities to teach only up to secondary school (class X). In these upper classes, there is a dearth of good facilities at school. The situation is aggravated by students dropping out to pursue work. As a result, there isn’t enough demand to improve the conditions in the secondary section of these schools.

Hypothesis testing

We have done an exploratory analysis of the data. Hypothesis testing lets us check whether the differences observed above are statistically significant.

Dropout

Sex

We do a t-test to obtain:

t = -0.981, p = 0.330

The p-value is large, so we fail to reject the null hypothesis (H0) of equal mean dropout rates.

There is no statistically significant difference between the dropout rates of girls and boys.
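As a rough illustration of how such a t-test could be run, here is a minimal sketch using `scipy.stats.ttest_ind`. The dropout figures are hypothetical, the 5% significance level is an assumption, and the article does not say whether the original test was Student's, Welch's, or paired, so the choice of Welch's test here is also an assumption.

```python
from scipy import stats

# Hypothetical state-level dropout rates in % (illustrative, not the article's data).
girls_dropout = [4.1, 6.3, 2.8, 9.5, 7.2, 3.9, 12.4, 5.6, 8.1, 6.7]
boys_dropout = [4.8, 6.9, 3.5, 9.1, 8.0, 4.2, 10.9, 6.4, 8.8, 7.3]

# Welch's t-test (equal_var=False) does not assume equal variances.
t_stat, p_value = stats.ttest_ind(girls_dropout, boys_dropout, equal_var=False)

alpha = 0.05  # assumed significance level; the article does not state one
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {decision}")
```

Because each state contributes both a girls' and a boys' rate, a paired test (`scipy.stats.ttest_rel`) could be a reasonable alternative.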

Sections of school

We do an ANOVA test to obtain:

F-statistic = 45.282, p < 0.001

Therefore we reject H0.

There is a significant difference in dropout rates among different sections of the school.
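A one-way ANOVA of this kind could be sketched with `scipy.stats.f_oneway`, treating each section of the school as one group of state-level rates. The numbers below are hypothetical and only mimic the pattern described earlier (dropouts peaking in secondary school).

```python
from scipy import stats

# Hypothetical dropout rates (%) per state, grouped by section of the school.
primary = [1.2, 3.4, 2.1, 4.0, 0.8, 2.9]
upper_primary = [2.5, 4.1, 3.3, 5.2, 1.9, 3.8]
secondary = [14.2, 18.9, 12.5, 21.3, 16.8, 15.1]
higher_secondary = [3.1, 6.0, 2.2, 7.4, 4.5, 5.3]

# H0: all four sections have the same mean dropout rate.
f_stat, p_value = stats.f_oneway(primary, upper_primary, secondary, higher_secondary)
print(f"F = {f_stat:.3f}, p = {p_value:.3g}")
```

A significant F-statistic only says that at least one section differs; identifying which one would need a post-hoc comparison.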

Enrollment

Sex

We do a t-test to obtain:

t = 0.878, p = 0.383

The p-value is large, so we fail to reject H0.

There is no statistically significant difference between the enrollment rates of girls and boys.

Sections of school

We do an ANOVA test to obtain:

F-statistic = 58.335, p < 0.001

Therefore we reject H0.

There is a significant difference in enrollment rates among different sections of the school.

Student to teacher ratio

We do an ANOVA test to obtain:

F-statistic = 12.850, p < 0.001

Therefore we reject H0.

There is a significant difference in student: teacher ratio among different sections of the school.

Other factors

We do a χ2 test to assess independence between the section of the school and the presence of different factors.

The results of the χ2 test:

p = 0.0054 at 12 degrees of freedom.

Therefore we reject H0.

The presence of different factors depends on the section of the school.
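A χ2 test of independence could be sketched with `scipy.stats.chi2_contingency`. The table below is hypothetical: it assumes four sections and five factors, which gives (4 − 1) × (5 − 1) = 12 degrees of freedom, consistent with the value reported above. The exact table used in the analysis is not shown in the article, so the factor names and counts are assumptions.

```python
import numpy as np
from scipy import stats

sections = ["Primary", "Upper primary", "Secondary", "Higher secondary"]
factors = ["Electricity", "Drinking water", "Computer", "Boys' toilet", "Girls' toilet"]

# Hypothetical counts of schools having each facility
# (rows: sections, columns: factors). Not the article's data.
observed = np.array([
    [520, 880, 110, 840, 860],
    [610, 900, 260, 870, 890],
    [480, 850, 300, 600, 620],
    [700, 910, 420, 650, 670],
])

# H0: the distribution of factors is independent of the section.
chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2_stat:.2f}, p = {p_value:.4g}, dof = {dof}")
```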

Summary of hypothesis tests

  • Dropout and enrollment rates do not differ significantly between boys and girls.
  • Dropout and enrollment rates differ in different sections of the school.
  • Student: teacher ratio is different in different sections of the school.
  • The presence of different factors is dependent on the sections of the school.

Visualizing performance of states

Data can be plotted in at most three dimensions, so I used principal component analysis (PCA) to find the components that explain the maximum variance. This shrinks the number of dimensions while retaining information from the original, higher-dimensional features.

Test statistics for proceeding with PCA:

  • KMO statistic: 0.61
  • Bartlett’s test for sphericity: p ≈ 0
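For readers who want to reproduce this step, here is a minimal sketch of Bartlett's test for sphericity followed by a three-component PCA on the correlation matrix. The data is synthetic (36 rows standing in for states, 6 hypothetical features), and the KMO statistic is not computed here. The variable names are illustrative and do not come from the original analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_states, n_features = 36, 6

# Synthetic data: two latent drivers generate six correlated features.
latent = rng.normal(size=(n_states, 2))
mixing = rng.normal(size=(2, n_features))
X = latent @ mixing + 0.5 * rng.normal(size=(n_states, n_features))

# Standardize, then work with the correlation matrix R.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Bartlett's test: H0 says R is the identity (no correlation to exploit).
n, p = Z.shape
_, log_det = np.linalg.slogdet(R)
chi2_stat = -(n - 1 - (2 * p + 5) / 6) * log_det
dof = p * (p - 1) // 2
p_bartlett = stats.chi2.sf(chi2_stat, dof)

# PCA via eigen-decomposition of R, components sorted by explained variance.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()

loadings = eigvecs[:, :3] * np.sqrt(eigvals[:3])  # feature-by-component loadings
scores = Z @ eigvecs[:, :3]                       # state coordinates on PC1..PC3

print(f"Bartlett p-value: {p_bartlett:.3g}")
print("Explained variance (first 3):", np.round(explained[:3], 3))
```

The `loadings` matrix is what would be inspected to give each component an interpretation, and `scores` is what would be passed on to clustering.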

Let us find these broader features. The factor loadings for the three principal components:

Observations:

  • PC1 captures the presence of different factors in school
  • PC2 captures the student: teacher ratio in different sections of the school
  • PC3 captures enrollments (and, indirectly, dropouts) in different sections of the school

These three components were used to create non-hierarchical clusters. The maximum silhouette score was obtained with 3 clusters (here, index 0 corresponds to 2 clusters).
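Choosing the number of clusters by silhouette score could look like the sketch below. It assumes k-means as the non-hierarchical method (the article does not name the algorithm) and uses scikit-learn's `KMeans` and `silhouette_score` on synthetic three-dimensional scores rather than the actual principal components.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [8.0, 0.0, 0.0], [0.0, 8.0, 0.0]])

# Synthetic PC scores: 12 "states" scattered around each of three centers.
pc_scores = np.vstack([c + rng.normal(scale=0.5, size=(12, 3)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit.
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pc_scores)
    silhouettes[k] = silhouette_score(pc_scores, labels)

best_k = max(silhouettes, key=silhouettes.get)
for k, score in silhouettes.items():
    print(f"k = {k}: silhouette = {score:.3f}")
print("Best number of clusters:", best_k)
```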

Visualizing the clusters:

Mean values for different clusters

Interpretation

Cluster 0: The underdogs

These states lack school facilities but have a low student: teacher ratio and a high enrollment rate, both of which are commendable. This cluster has the highest potential, so we should first focus on improving the facilities in its schools.

Cluster 1: The privileged

Facilities are widely available and the student: teacher ratio is the lowest. Despite these favorable conditions, this group does not have the highest enrollment ratio. The reason becomes clear in the dropouts vs enrollments scatterplot: these states have much higher than average dropout rates. This may be due to socio-political, cultural, and other reasons not accounted for by the features of the school system.

Cluster 2: Absent teachers

In this group the student: teacher ratio is very high, which is reflected in the lowest enrollment ratio. We need to staff more teachers in the schools in this cluster.

Conclusion

The major factors accounting for the differences in education sector performance among states:

  • Dropouts in secondary
  • Enrollments in higher secondary
  • Presence of electricity and computers in the schools

Further research

The next big startup is expected to be an education company. I believe that, instead of marketing a new product, it will improve upon the existing infrastructure. According to the Pareto principle, roughly 80% of benefits come from focusing on 20% of issues. Here, that means covering the gaps in secondary school and directing efforts towards reducing dropouts there.

Call to action

  • Staffing qualified teachers in states in the third cluster.
    Specific requirements:
    1. The government has to reduce the pay gap between the public and private schools to attract talent.
    2. Leading NGOs in the education sector who staff underfunded schools should be allowed to transition their fellows/interns into permanent positions.
    3. A lot of engineers are unemployed. They have learned a lot of maths and science. They can be reskilled to become skilled teachers.
    Measurable targets: India should reach a student: teacher ratio of 19 in secondary and 23 in higher secondary.
    Hypothetical timeframe: 3 years.

Student: teacher ratio for India vs the world

  • Teachers are the single most important factor for ensuring learning outcomes. Teaching quality needs to be assessed.
    Specific requirements: New metrics have to be established following global standards as mentioned here.
    Measurable targets: Teach
    Hypothetical timeframe: 3 years.

  • Improving the school facilities by ensuring the availability of electricity in the first cluster.
    Specific requirements: Increase the % of schools having electricity.
    Measurable targets:
    Hypothetical timeframe: 3 years.

These tasks are no small feat and will require a lot of work at the grassroots level in towns and rural areas.

  • Understand what causes high dropout rates in the second cluster by holding focus group discussions with teachers and parents from these states (predominantly north-eastern states).

Data

School Education in India

MHRD data: https://www.education.gov.in/sites/upload_files/mhrd/files/statistics-new/ESAG-2018.pdf

Other references:

https://www.indiatoday.in/education-today/news/story/15-initiatives-taken-by-central-government-to-improve-teaching-standards-in-india-hrd-minister-1556357-2019-06-26

Global benchmarks:

Pupil-teacher ratio, primary

Pupil-teacher ratio, lower secondary

Pupil-teacher ratio, secondary

Pupil-teacher ratio, upper secondary

