How the Pandemic Has Affected College Scores: Analysis on a Real Dataset

A deep dive into my college scores and the trends hidden in them

Kaustubh Gupta
Towards Data Science


The coronavirus pandemic has affected a lot of people around the world. Businesses of all kinds have been hit, people are switching to different domains, and most of the industry is moving online. With online work now so common, education is no exception. I live in India, and this is the first time all the educational institutes here have opted for online teaching. The system is very new for us, and it will take time to fully adapt to it. My 4th-semester grades were released recently, and I was shocked when I compared this semester’s stats with those of previous semesters. Want to know what I found? Let’s start digging into the data!

Photo by Giorgio Tomassetti on Unsplash

About the Data

How does the data look?

One of the best things about my University is that it releases the results as open PDFs that are accessible to everyone visiting its website. But here is the catch: the PDFs have multilayered, multilevel tables that are difficult to process. Even if you look at the table and try to infer your final score, it’s a tedious process, as one needs to refer to multiple pages from multiple PDFs to get the desired value. Here is a screenshot of what the PDF looks like:

Some of the details have been hidden due to privacy issues. Photo by Author

If you are still curious about viewing the actual PDF, head over to this link and download any semester result you want to see!

An important note

I will not explain the data extraction process, as the parser is still under development and the source code is not public yet. It is a long process: the files can range from 100 to 400 pages, and a lot of preprocessing has to be done to obtain the best results. I have made a simplified result portal based on these PDFs. You can check that here (it’s a Heroku app, so it may take a minute or so to boot up). The University releases the results of all its affiliated colleges in separate files but at the same time. I have combined the data of all colleges into one file per semester to simplify the comparison. Each file has around 5.9k entries, and here we will compare the 4th-semester results with previous semesters.

Abbreviations:

Common Branches in India:

  • IT: Information Technology
  • CSE: Computer Science and Engineering
  • ECE: Electronics and Communication Engineering
  • EEE: Electrical and Electronics Engineering
  • MAE: Mechanical and Automation Engineering

How many students are enrolled in different branches?

The first question is pretty straightforward, and I will answer it by presenting the data as a bar plot.

Students in Different Branches
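For anyone who wants to reproduce this kind of count, here is a minimal sketch in plain Python. The records, the field names (`roll_no`, `branch`) and the helper `students_per_branch` are made up for illustration and are not the actual parser output; the real files hold around 5.9k entries each.

```python
from collections import Counter

# Hypothetical records standing in for one combined semester file.
students = [
    {"roll_no": 1, "branch": "CSE"},
    {"roll_no": 2, "branch": "CSE"},
    {"roll_no": 3, "branch": "IT"},
    {"roll_no": 4, "branch": "ECE"},
    {"roll_no": 5, "branch": "CSE"},
    {"roll_no": 6, "branch": "MAE"},
    {"roll_no": 7, "branch": "ECE"},
]


def students_per_branch(rows):
    """Return (branch, count) pairs, largest branch first."""
    counts = Counter(row["branch"] for row in rows)
    return counts.most_common()


branch_counts = students_per_branch(students)

# Text version of a bar plot: one '#' per student.
for branch, count in branch_counts:
    print(f"{branch:<4} {'#' * count} ({count})")
```

The same pairs can be handed to any plotting library to draw a real bar chart; the text bars here just keep the sketch free of dependencies.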

Here is a fun fact: India produces 25% of the world’s engineers every year but lacks the researchers who drive innovation. Here, every student wants to pursue CSE just for the sake of a higher salary, whether they are skilled enough or not. This bar chart makes that evident too: even in a fairly small University, around 2k students have opted for CSE (I opted for IT, if you want to know 😀), followed by ECE, while MAE is the least favored branch.

You must be wondering what this has to do with the pandemic. This is where I make a prediction. New admissions are yet to start, so I have no data for this session. Seeing the current scenario, though, I believe CSE and IT will see another peak, because most students are now using gadgets extensively and will be genuinely interested in technology rather than driven by some external factor. Maybe this will even increase the number of researchers.

Average SGPA and Percentage

SGPA stands for Semester Grade Point Average, and it is the most common criterion for filtering job resumes in the initial stage. Let’s look at a comparison of the 3rd-semester SGPA with the 4th-semester SGPA:

Here, every college’s average lies within the respective semester’s range
Here, every branch’s average lies within the respective semester’s range

Surprised, confused? I was too! Let me explain what’s going on. The graph you are looking at is a seaborn distplot, which shows where the average SGPAs of the colleges or branches lie within the distribution, and each graph shows data for both the 3rd and 4th semesters. To give a better understanding, here is the branch-wise SGPA of the 4th semester in numerical form:

Data by Author
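The averages behind such a comparison come from a simple group-and-mean step. Here is a rough sketch of it in plain Python; the rows and the helper `average_sgpa` are hypothetical, and the numbers are invented purely to show the mechanics.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (branch, semester, sgpa). Not real results.
rows = [
    ("CSE", 3, 7.0), ("CSE", 3, 6.0), ("CSE", 4, 8.5), ("CSE", 4, 8.0),
    ("IT", 3, 6.5), ("IT", 3, 7.5), ("IT", 4, 9.0), ("IT", 4, 8.0),
]


def average_sgpa(rows):
    """Average SGPA for every (branch, semester) pair."""
    grouped = defaultdict(list)
    for branch, semester, sgpa in rows:
        grouped[(branch, semester)].append(sgpa)
    return {key: round(mean(values), 2) for key, values in grouped.items()}


averages = average_sgpa(rows)
for (branch, semester), avg in sorted(averages.items()):
    print(f"{branch} semester {semester}: {avg}")
```

These per-branch averages are the kind of values a distribution plot would then be drawn over. If you try this with a recent seaborn release, note that `distplot` has been deprecated in favor of `histplot` and `displot`.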

How did this shift in the SGPA range happen in the 4th semester? The answer is very simple. During this semester, the coronavirus was at its peak (it’s still at its peak ☹️) and all classes were suspended. The University tried to come up with different ways to evaluate students, as physical exams were not feasible and online exams had their own difficulties. It was then decided that students would be evaluated 50% on the 3rd semester and 50% on an internal assessment, which was conducted via telephone calls and some Google quizzes. It is strange that, at the college level, the 3rd-semester SGPA was between 5.5 and 7.5, yet the 4th semester had a range of 6.5 to 9. To prove my point, I went back and compared these semesters with the 2nd semester, and the results were truly shocking:

Comparison of 2nd-semester SGPA with 3rd-semester
Comparison of 2nd-semester SGPA with 4th-semester
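To see why a 50/50 rule can push the whole range upward, here is a small worked sketch. The University’s exact formula is not given, so this assumes both components sit on the same 10-point scale and are combined as a plain weighted average; `blended_sgpa` and the sample scores are hypothetical.

```python
def blended_sgpa(previous_sgpa, internal_score, weight=0.5):
    """Weighted mix of the previous semester's SGPA and the internal assessment.

    Assumption for illustration only: both inputs are on a 10-point scale
    and are combined as a simple weighted average.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * previous_sgpa + (1 - weight) * internal_score


# Hypothetical student: 6.0 in the 3rd semester, 9.5 in a lenient internal assessment.
result = blended_sgpa(6.0, 9.5)
print(result)

# The same rule applied to both ends of a 5.5 to 7.5 range,
# with an internal assessment of 9.0 for everyone.
low, high = blended_sgpa(5.5, 9.0), blended_sgpa(7.5, 9.0)
print(low, high)
```

Under these assumptions, a generous internal assessment lifts every student toward the top of the scale, and it lifts the weakest students the most, which would narrow and shift the distribution.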

From the above two graphs, we can say that the evaluation method for the 4th semester was far too lenient. Scoring an SGPA of 8 this semester is not a big deal compared to scoring the same in the 2nd semester. The percentage trends are very similar to what we found for the SGPA:

Comparison of 2nd-semester Percentage with 4th-semester
Comparison of 3rd-semester Percentage with 4th-semester

Let’s assume that the University made a mistake by taking this route of evaluation and ended up giving extra marks to all the students. Thinking from a student’s perspective (which I am!), it’s absolutely amazing to know that small efforts gave such good results. In the long run, though, this won’t be helpful, and it hints at one reason why India has a high unemployment rate.

The Ultimate Toppers

The 4th semester proved to be a boon for students: their low CGPA was boosted when this semester’s SGPA got added to it. Just as every group has a leader, every batch has its toppers, exceptional people who somehow perform well in every scenario. Given the current situation, I thought we would not see this trend this semester, but I was proved wrong when I saw the filtered results:

Branch wise number of students scoring perfect 10 SGPA

As many as 60 students from a single branch got a perfect 10 this semester! I tried the same filter on the 3rd semester and got no results! Nobody scored a perfect 10, so I tweaked the logic to get SGPAs greater than 9 for the 3rd semester, and here it is:

Branch wise number of students scoring SGPA greater than 9

Even with this relaxed criterion, only 45–48 students got an SGPA greater than 9!
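The two filters used here, a perfect 10 and an SGPA above 9, can be sketched with one small helper. The rows and the function `count_by_branch` are hypothetical stand-ins, not the actual dataset or the code behind the charts.

```python
from collections import Counter

# Hypothetical rows: (branch, sgpa) for each semester. Not real results.
semester_4 = [("CSE", 10.0), ("CSE", 9.4), ("IT", 10.0), ("ECE", 8.7), ("CSE", 10.0)]
semester_3 = [("CSE", 9.2), ("CSE", 8.1), ("IT", 9.0), ("ECE", 9.5), ("CSE", 7.9)]


def count_by_branch(rows, predicate):
    """Count students per branch whose SGPA satisfies the predicate."""
    return Counter(branch for branch, sgpa in rows if predicate(sgpa))


# Strict filter: perfect 10 only.
perfect_tens_sem4 = count_by_branch(semester_4, lambda s: s == 10.0)
perfect_tens_sem3 = count_by_branch(semester_3, lambda s: s == 10.0)

# Relaxed filter: anything strictly greater than 9.
above_nine_sem3 = count_by_branch(semester_3, lambda s: s > 9.0)

print("Perfect 10, 4th semester:", dict(perfect_tens_sem4))
print("Perfect 10, 3rd semester:", dict(perfect_tens_sem3))
print("Above 9, 3rd semester:", dict(above_nine_sem3))
```

Passing the condition in as a function keeps the counting logic in one place, so switching from the strict filter to the relaxed one is a one-line change.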

Final Note

This is an alarming situation: if we give students this much leniency in marking, they will take these classes for granted and never work hard to earn a good score through actual performance. This is similar to what we call false positives in model building: the students are given marks when they deserved less (comment down below if you want a detailed explanation of confusion matrices). From a student’s perspective too, we can never be satisfied with scores that we haven’t earned, and I think a better evaluation method is needed to replace this faulty system.
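For readers unfamiliar with the term, here is a tiny sketch of how false positives are counted. The two lists are invented for the analogy: `deserved` marks whether a student truly earned a high score, `awarded` marks whether they were given one. Neither comes from the real data.

```python
def confusion_counts(actual, predicted):
    """Tally true/false positives and negatives for two boolean sequences."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for is_true, is_flagged in zip(actual, predicted):
        if is_flagged and is_true:
            counts["TP"] += 1  # awarded and deserved
        elif is_flagged and not is_true:
            counts["FP"] += 1  # awarded but not deserved
        elif not is_flagged and is_true:
            counts["FN"] += 1  # deserved but not awarded
        else:
            counts["TN"] += 1  # neither awarded nor deserved
    return counts


# Hypothetical labels for six students.
deserved = [True, False, False, True, False, False]
awarded = [True, True, True, True, False, True]

counts = confusion_counts(deserved, awarded)
print(counts)
```

In this toy example, the lenient evaluation shows up as a large FP count: high scores handed to students who, by the assumed ground truth, did not earn them.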

That’s all for this article. I hope I was able to convey my analysis in the best way possible. Follow me on Medium to get updates about new articles, and if you are a passionate Python developer who wants to upgrade your skill set, make sure to check out my Android app in Python series on Medium. With that said, Namaste!

