
How to Hyper-Learn Data Science

Tips to be an efficient and fast learner

Photo by Marc-Olivier Jodoin on Unsplash

I wanted to write this article because I’ve gotten a lot of questions about how to approach learning data science, and I understand the struggle. It’s really overwhelming at the start, especially when you find out that you have to learn programming, statistics, mathematics, and so on. It seems like the list is endless, but trust me when I say that it’s not as bad as you think.

My goal is two-fold:

  1. I want to smooth your learning journey by giving you some direction and tips
  2. I want to share the tips that helped me learn at a faster pace
Image Created by Author

With that said, let’s dive right into it!


1) How much you learn is determined by two variables…

First, you should know that "learning" refers to the acquisition of both knowledge and skills. So when I say "learn", I am referring to learning theory (knowledge) and learning how to apply that knowledge (skills).

This isn’t rocket science, but there are two main factors that play into how much you learn in a given time period:

  • Time invested: If you spend 2 hours a day learning data science instead of 1 hour a day, you can work through roughly twice as much material and spend twice as much time applying your skills (e.g., programming).
  • Amount retained: Learning skills is one thing, but retaining them is another. You might have heard of the forgetting curve: without review, most of what you learn fades quickly. Simply put, you need to be consistent in learning data science and practicing what you learn.
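
To make the retention point concrete, here is a minimal sketch that assumes the common exponential approximation of the forgetting curve, R = exp(-t / s), where t is the number of days since your last review and s is a "stability" value. The starting stability and the rule that each review doubles it are illustrative assumptions, not measured values.

```python
import math


def retention(days_since_review, stability):
    """Exponential forgetting-curve approximation: R = exp(-t / s)."""
    return math.exp(-days_since_review / stability)


def retention_after(total_days, review_days, stability=2.0):
    """Estimated retention on day `total_days`, given the days you reviewed.

    Assumption for illustration only: every review resets the clock
    and doubles the stability of the memory.
    """
    last_review = 0
    for day in sorted(review_days):
        if day > total_days:
            break
        stability *= 2
        last_review = day
    return retention(total_days - last_review, stability)


# Learning once versus reviewing on days 1, 3 and 7, both measured on day 14.
crammed = retention_after(14, review_days=[])
spaced = retention_after(14, review_days=[1, 3, 7])
print(f"no reviews: {crammed:.3f}, spaced reviews: {spaced:.3f}")
```

Under these assumptions, a few short reviews leave far more in memory than a single long session, which is the argument for consistency over intensity.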

Personally, I think one of the best decisions I made was committing to learning and writing about something related to data science once a week for 52 weeks, because it forced me to invest a good amount of time and be consistent.


2) Start with the fundamentals

If you’ve read my previous articles, I probably sound like a broken record at this point, but starting with the fundamentals will go a long way. It might feel like it’s the slower route, but this will allow you to learn more complex concepts that build on these fundamentals in the future.

The fundamentals that I recommend learning are:

  • Statistics and Probability: Data science and machine learning are essentially a modern version of statistics. By learning statistics first, you’ll have a much easier time when it comes to learning machine learning concepts and algorithms.
  • Calculus and Linear Algebra: Like statistics, many data science concepts build on fundamental mathematical concepts. In order to understand cost functions, you need to know differential calculus. In order to understand hypothesis testing, you need to understand integration. And to give one more example, linear algebra is essential to learning deep learning concepts, recommendation systems, and principal component analysis.
  • Programming (Python, SQL): SQL is arguably the most important skill to learn across any type of data-related profession, whether you’re a data scientist, data engineer, data analyst, or business analyst. As for Python, it seems to be the main scripting language used by data scientists (and I know nothing about R).
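
As one illustration of why differential calculus matters for cost functions, the sketch below fits a line y = w*x + b by gradient descent on a mean squared error cost. The data points, learning rate, and number of steps are made up for the example.

```python
def mse_gradients(w, b, xs, ys):
    """Partial derivatives of the cost J = mean((w*x + b - y)^2)."""
    n = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    dw = 2 / n * sum(e * x for e, x in zip(errors, xs))  # dJ/dw
    db = 2 / n * sum(errors)                              # dJ/db
    return dw, db


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step w and b against the gradient of the cost."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradients(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

The two derivatives in `mse_gradients` are exactly what differential calculus gives you. Without them, there is no principled way to decide which direction to move `w` and `b`.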

You don’t have to learn everything about the topics above, but you should definitely know the fundamentals before diving into machine learning and deep learning.

Check out my article below if you’d like some resources to learn these:

A Complete 52 Week Curriculum to Become a Data Scientist in 2021

This leads me to my next point…


3) Don’t try to memorize everything

It’s one thing to understand what you learn, but it’s another to try to memorize everything. Especially when it comes to SQL, Python, and Pandas, don’t feel like you have to learn every function and method that they have to offer. Instead, focus on learning how to Google the right questions when programming.

I’ve talked to veterans in the data science community and I haven’t met a single person who has memorized every SQL and Python function. Memorizing is an inefficient use of time that can be better spent on other things, like building projects!


4) Learn by "doing"

As I alluded to earlier, you’ll learn and retain more knowledge and skills by doing rather than just studying. Similar to how you do homework after you learn a new concept in school, you need to constantly apply what you learn to projects.

And don’t worry about completing complex projects. Even something as simple as conducting exploratory data analysis on a dataset will help you accelerate your learning.
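
As a sketch of how small such a project can start, the snippet below runs a first pass of exploratory data analysis using only Python’s standard library. The inline CSV is a made-up stand-in for a real dataset and its column names are hypothetical. With pandas, you would typically get similar summaries from `DataFrame.describe()` and `DataFrame.isna().sum()`.

```python
import csv
import io
import statistics

# Hypothetical toy data standing in for a real CSV file.
RAW = """age,monthly_spend,churned
22,7.25,0
38,71.28,1
26,,1
35,53.10,1
,8.05,0
"""

rows = list(csv.DictReader(io.StringIO(RAW)))


def summarize(rows, column):
    """Count missing values and compute basic statistics for a numeric column."""
    values = [float(r[column]) for r in rows if r[column] != ""]
    return {
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


for column in ("age", "monthly_spend"):
    print(column, summarize(rows, column))
```

Even this much tells you where values are missing and whether a column is skewed, which are usually the first questions to answer about a new dataset.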

Here are some ideas to get you started:

Idea 1: SQL Case Study

Link to the case.

The objective of this case is to determine the cause for a drop in user engagement for a social network called Yammer. Before diving into the data, you should read the overview of what Yammer does here. There are 4 tables that you should work with.

The link to the case above will provide you with much more detail pertaining to the problem, the data, and the questions that should be answered.

Check out how I approached this case study here if you’d like guidance.
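
To give a feel for the kind of query this case calls for, here is a minimal sketch that counts weekly active users with SQLite from Python. The `events` table, its columns, and its rows are hypothetical and are not the schema from the case itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id INTEGER, occurred_at TEXT, event_type TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2021-01-04", "engagement"),
        (2, "2021-01-05", "engagement"),
        (1, "2021-01-06", "engagement"),
        (1, "2021-01-12", "engagement"),
        (3, "2021-01-13", "signup"),
    ],
)

# Group engagement events by calendar week and count distinct users per week.
query = """
SELECT strftime('%Y-%W', occurred_at) AS week,
       COUNT(DISTINCT user_id)        AS weekly_active_users
FROM events
WHERE event_type = 'engagement'
GROUP BY week
ORDER BY week;
"""
weekly_active = conn.execute(query).fetchall()
for week, users in weekly_active:
    print(week, users)
```

A drop in a series like this is the starting point. The real work is slicing it further (by device, region, or event type, for example) until the cause shows up.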

Idea 2: Trustpilot Webscraper

Web scraping is simple to learn and extremely useful, especially when it comes to collecting data for personal projects. Scraping a customer review website like Trustpilot is valuable for a company because it lets them track review trends (whether ratings are getting better or worse) and see what customers are saying via NLP.

First, I would get familiar with how Trustpilot is organized and decide which kinds of businesses to analyze. Then I would take a look at this walkthrough of how to scrape Trustpilot reviews.
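
The parsing step of a scraper can be sketched with Python’s built-in `html.parser`. The HTML below is a made-up snippet, and the `review` class and `data-rating` attribute are hypothetical. Trustpilot’s real markup is different and changes over time, so you would need to inspect the page yourself.

```python
from html.parser import HTMLParser

# Made-up markup for illustration; not what Trustpilot actually serves.
SAMPLE_HTML = """
<div class="review" data-rating="5"><p>Fast delivery, great service.</p></div>
<div class="review" data-rating="2"><p>Package arrived late.</p></div>
"""


class ReviewParser(HTMLParser):
    """Collect (rating, text) pairs from <div class="review"> blocks."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._rating = None    # rating of the review currently being read
        self._in_text = False  # True while inside the review's <p> tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "review":
            self._rating = int(attrs["data-rating"])
        elif tag == "p" and self._rating is not None:
            self._in_text = True

    def handle_data(self, data):
        if self._in_text:
            self.reviews.append((self._rating, data.strip()))

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_text = False
        elif tag == "div":
            self._rating = None


parser = ReviewParser()
parser.feed(SAMPLE_HTML)
average_rating = sum(rating for rating, _ in parser.reviews) / len(parser.reviews)
print(parser.reviews, average_rating)
```

Once the reviews are in a list of tuples like this, tracking the average rating over time or feeding the text into an NLP step is a small next move.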

Idea 3: Titanic Machine Learning Competition

In my opinion, there’s no better way of showing that you’re ready for a data science job than to showcase your code through competitions. Kaggle hosts a variety of competitions that involve building a model to optimize a certain metric, one of them being the Titanic Machine Learning Competition.

If you want to get some inspiration and guidance, check out this step-by-step walkthrough of one of the solutions.


Thanks for Reading!

I hope you find these tips helpful! The most important tip is to be consistent in your learning. I think that prevails over your methodology of learning and the resources you use to learn. Consider everything else as smaller hyper-parameters that you can tune ;).

I wish you the best in your learning endeavors!

Not sure what to read next? I’ve picked another article for you:

A Complete 52 Week Curriculum to Become a Data Scientist in 2021

and another!

How I’d Learn Data Science if I Could Start Over (2 years in)

Terence Shin

