Mistakes to Avoid in the Data Science Interview

How you can learn and avoid these mistakes in the future

Abhishek Pawar
Towards Data Science

Photo by LinkedIn Sales Solutions on Unsplash

Data Science is one of the fastest-growing domains in the technology industry. If you are looking for an entry-level Data Scientist (DS) or Machine Learning Engineer (MLE) job straight out of college, it helps to know the common mistakes candidates make in interviews. Even a simple, avoidable mistake can reduce your chances of being shortlisted.

If you are consistently getting rejected for DS or MLE roles, you need to analyze where your efforts fall short. This blog covers common mistakes candidates make in DS and MLE interviews.

I have been fortunate to interview 70+ candidates for different roles (thanks to AlgoAnalytics), and that experience is what I am sharing here on Medium.

Mistake 1: GitHub repositories with a missing or incomplete README.md

Many entry-level DS candidates think that sharing a Jupyter notebook on GitHub will make a significant impact on their profile. However, chances are that an HR or non-technical recruiter does not know what a Jupyter notebook file is or how to open one!

To showcase your hard work, please spend some time writing a high-level description of the project in the README. The ideal README can include (but is not limited to):

  • Introduction about the problem you are trying to solve
  • The source of the dataset
  • If the data is scraped, how did you do that?
  • What baseline models were considered or used? (More on this later.)
  • What algorithms are used? What results are achieved?
  • How to reproduce the results?
  • If the app requires Docker, how to run the container?
  • If the app is deployed, a link to the app (Bonus)

Mistake 2: Broken hyperlinks on the resume!

This seems like a check everyone does, but I have seen a few broken hyperlinks on resumes. You don’t want your interviewer to see Page Not Found 404 📛 and form a weak impression before the interview even starts!
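One way to make this check routine is a small script that requests every link on your resume and reports the ones that fail. The sketch below is only an illustration: the helper names `http_status` and `find_broken_links` are made up for this example, and it assumes the sites answer HEAD requests (some servers reject them, so confirm any reported link in a browser before changing it).

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered with an error, e.g. 404
    except (urllib.error.URLError, ValueError, OSError):
        return 0  # malformed URL, DNS failure, timeout, refused connection


def find_broken_links(
    urls: Iterable[str],
    get_status: Callable[[str], int] = http_status,
) -> List[Tuple[str, int]]:
    """Return (url, status) pairs for links that failed or returned 4xx/5xx."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status == 0 or status >= 400:
            broken.append((url, status))
    return broken
```

Pass it the list of links from your resume; an empty result means every link responded without an error status.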

Mistake 3: Your Machine Learning model is not deployed

The goal of Machine Learning is to solve a problem, and a model only does that once it is in production and a user or service is consuming its predictions.

Photo by Ian Taylor on Unsplash

So it is worth learning to deploy an ML model in a real-world setting. It can help you show that:

  • You are aware of the technologies/platforms like Docker, AWS, or Heroku
  • You can showcase your creativity with Streamlit or Gradio
  • You have the zeal to learn and implement end-to-end solutions

Mistake 4: Jumping straight to State-of-the-Art (SOTA) Deep Learning in personal projects

Photo by Alex Radelich on Unsplash

Do not jump straight to SOTA algorithms in the first iteration just because they are compelling and look cool. Start with a baseline model. For example, pretrained embeddings can provide a strong baseline for NLP tasks.

The baseline model could be a heuristic model or even a non-ML model! Figuring out what the baseline fails to capture will help you set the direction for new experiments.
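As a rough illustration of the idea, the sketch below compares a non-ML baseline (always predict the most frequent class) with a simple model. The synthetic dataset, the choice of logistic regression, and all variable names are assumptions made for this example, not a recommendation for any particular project.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, used here only for illustration.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Baseline: ignore the features and always predict the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# First real model: a simple linear classifier.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

baseline_accuracy = baseline.score(X_test, y_test)
model_accuracy = model.score(X_test, y_test)

print(f"Baseline accuracy: {baseline_accuracy:.3f}")
print(f"Logistic regression accuracy: {model_accuracy:.3f}")
```

Any later experiment then has two numbers to beat, and the gap between them shows how much the added complexity actually buys.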

Mistake 5: Not practicing Python/DSA questions

Even though some companies do not have DSA round(s), a DS candidate is expected to have good exposure to Python basics and data structures. In my short experience as a DS interviewer, I have seen that candidates know Python built-ins but, due to lack of practice, struggle to solve simple questions like merging two sorted arrays.

So I highly recommend getting our hands dirty with basic DSA questions (We cannot escape LeetCode! 💻) and improving our problem-solving skills.
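For reference, here is one common way to answer the merge question mentioned above, using two pointers. It is a minimal sketch; the function name `merge_sorted` is chosen for this example, and an interviewer may ask for variations such as merging in place.

```python
from typing import List


def merge_sorted(left: List[int], right: List[int]) -> List[int]:
    """Merge two already-sorted lists into one sorted list in O(n + m) time."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the lists still has elements; append them as they are.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sorted([1, 4, 7], [2, 3, 9]))  # [1, 2, 3, 4, 7, 9]
```

Calling `sorted(left + right)` also works, but it takes O((n + m) log(n + m)) time and does not show that you can use the fact that both inputs are already sorted.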

Mistake 6: Not practicing common questions about the projects

“Tell me something about this project” is one of the most common questions in the interview. Even so, many candidates spend most of their time on the project introduction and the metrics, and very few talk about the impact and the challenges they overcame!

This is one of the most important questions, because it can help you:

  • Drive the interview discussion in your favor
  • Showcase your area of expertise
  • Showcase your communication and storytelling skills

So, an ideal answer, the kind the interviewer expects, covers:

  • Initial background of the business problem you are trying to solve
  • Who are the end-users of this solution? How are they consuming the model predictions?
  • Source of the data
  • Preprocessing steps
  • Baseline models and other experiments
  • Metrics used for evaluation
  • Model deployment and challenges

I highly encourage readers to write down and practice their answers to the common questions. It will make you more comfortable answering them in interviews.

Mistake 7: Not building a strong foundation in the basics

This is one of the most underrated pieces of advice, and one of the most common mistakes I observe in beginner-level DS candidates. Many candidates skip the basics and jump to advanced concepts, for example ignoring Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) models and going straight to Transformers! Please avoid doing this, because the basics lay the foundation for advanced concepts.

Get the fundamentals down and the level of everything you do will rise — Michael Jordan.

There are many free resources that can help you build a strong understanding of the concepts. A few of my favorite resources are Stanford Online and the book Hands-On Machine Learning with Scikit-Learn and TensorFlow.

However, make sure to follow standard and verified resources when applying machine learning concepts. For example, a common mistake is to fit StandardScaler separately on the training and testing data 📛

# Example of how NOT to scale features! 
# We should NEVER use fit_transform() on the test data.
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)
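
For contrast with the snippet above, here is a minimal sketch of the usual approach: fit the scaler on the training data only, then reuse the learned mean and standard deviation on the test data with `transform()`. The tiny arrays are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration only.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[2.5], [10.0]])

scaler = StandardScaler()

# Learn the mean and standard deviation from the training data only.
X_train_scaled = scaler.fit_transform(X_train)

# Reuse the training statistics on the test data; do not refit.
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_)  # mean learned from X_train
print(X_test_scaled)
```

Refitting on the test set would scale it with its own statistics, so the model would see test features on a different scale from the one it was trained on, and information about the test set would leak into preprocessing.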

Closing thoughts 💡

Data Science interviews are challenging, and there is no silver bullet to ace them. Each company has its own standards and procedures for evaluating candidates. A few things that helped me in the interview process were being open to feedback, analyzing what I could have done better, and never giving up because of failures!

I hope this blog gives you insight into avoiding these errors and helps you stand out in future interviews. Do share the story with your network and comment on your recent DS interview experience.

Please feel free to follow or connect on LinkedIn. Till then, see you in the next post :) Thanks for reading. Take care!
