
Take-Home Exercises can Make or Break Your DS Interviews

How to tackle your take-home exercise – arguably the most important part of the interview and the part that's most in your control

Photo by 91 Magazine on Unsplash

Office Hours

What are take-home exercises?

Take-home exercises are an increasingly common part of the Data Science (DS) interview process. More than 90% of the tech companies I have interviewed with so far included one, typically after the technical screening and before the onsite interviews. You are usually given a sample dataset and a list of open-ended questions, like the example here. Most companies then give you a chance to present your results to an interviewer or a panel during the onsite round.

Image by Author

Why do employers love take-home exercises?

From the employer’s perspective, a take-home exercise is a great way to judge candidates: it is the closest thing to your day-to-day work as a DS, it gives the employer a glimpse of your real potential, and it lets them see how you approach a complex problem end-to-end. For exactly this reason, many companies put substantial weight on the take-home exercise (I know this because I have served on interview panels several times myself). I have seen candidates who did not do well enough in live coding redeem themselves through the take-home exercise; I have also seen candidates who passed the technical round with flying colors ruin their chances by not paying enough attention to it.

How should you approach take-home exercises?

Precisely because I know how much take-home exercises matter, they used to intimidate me: "Where do I start?", "Should I spend more than 2–3 hours even though the recruiter told me not to?", "Should I look things up online?". I have read many articles about preparing for DS interviews and have written several myself, but I haven’t found any good resources on how to prepare for and approach take-home exercises. So I have organized the tips and tricks I learned as both an interviewer and an interviewee into some ideas for how you can use the take-home exercise to stand out.

1. Always, always, always QA the data that was given to you

As I mentioned, take-home exercises usually come with a small sample dataset for you to explore and use to answer the questions posed. In my experience, the data given to candidates almost always contains some "mistakes" or "inaccuracies": timestamps that are incorrectly labeled, for example, or lat-long info that is clearly not accurate. These "traps" are put in the dataset to test one of the most important qualifications a DS should have: awareness of data inaccuracies in real life and the ability to effectively review and clean datasets.

Usually, the sample dataset will be small enough to eyeball for glaring mistakes; if not, randomly sample some rows to test the integrity of the data. This is only the starting point of QA, aimed at catching obvious inaccuracies (e.g. all timestamps from a certain date are at midnight, whereas timestamps for all other dates are spread across different hours of the day). QA will continue as you proceed to the next step of the analysis, EDA (exploratory data analysis).
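As a rough illustration of this first QA pass, the sketch below samples a few rows to eyeball and flags two of the problems mentioned above: dates whose timestamps all sit exactly at midnight, and coordinates outside the valid latitude/longitude ranges. The column names (`timestamp`, `lat`, `lon`), the helper names, and the toy rows are hypothetical; adapt them to whatever dataset you are given.

```python
import random
from collections import defaultdict
from datetime import datetime

# Hypothetical toy rows standing in for the sample dataset.
rows = [
    {"timestamp": "2021-03-01 08:15:00", "lat": 37.77, "lon": -122.42},
    {"timestamp": "2021-03-01 17:40:00", "lat": 37.80, "lon": -122.27},
    {"timestamp": "2021-03-02 00:00:00", "lat": 37.78, "lon": -122.41},
    {"timestamp": "2021-03-02 00:00:00", "lat": 137.78, "lon": -122.40},
    {"timestamp": "2021-03-03 12:05:00", "lat": 37.76, "lon": -122.43},
]


def sample_rows(rows, k=3, seed=0):
    """Return up to k randomly chosen rows to eyeball."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def dates_all_midnight(rows):
    """Return the dates on which every timestamp is exactly 00:00:00."""
    times_by_date = defaultdict(list)
    for row in rows:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        times_by_date[ts.date()].append(ts.time())
    midnight = datetime.min.time()
    return sorted(
        day.isoformat()
        for day, times in times_by_date.items()
        if all(t == midnight for t in times)
    )


def invalid_coordinates(rows):
    """Return the indices of rows whose lat/lon fall outside valid ranges."""
    return [
        i
        for i, row in enumerate(rows)
        if not (-90 <= row["lat"] <= 90 and -180 <= row["lon"] <= 180)
    ]


for row in sample_rows(rows):
    print(row)
print("Suspicious dates:", dates_all_midnight(rows))
print("Rows with invalid coordinates:", invalid_coordinates(rows))
```

A flagged date or row is only a prompt to investigate further; whether it is a genuine error is something to note and justify in your write-up.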

2. Start your analysis with EDA (Exploratory Data Analysis)

Farcaster at English Wikipedia

Every DS should already be familiar with the EDA process and how to conduct it; if you are not, here is a quick definition:

EDA is the initial process performed on the data prior to any modeling. It aims to provide an initial understanding of the data, help identify whether data cleaning is needed (if there are outliers or anomalies in the data), and provide an opportunity to quickly form and/or test hypotheses using summary statistics and visualizations.

There are convenient functions in the Python pandas package (shape, info, describe, etc.) and the R dplyr package that come in handy at this step. I’m not going to get into the details of EDA in this article, but I want to briefly point out several things to pay attention to with regard to take-home exercises:

  • Missing values: Pay special attention to missing values, especially in variables you expect to be crucial for your follow-on analyses. It’s equally important to come up with a mitigation plan for them (e.g. dropping the rows or filling in missing values with the median/mean).
  • Outliers: By plotting the distribution of each variable, it should be easy to spot any outliers; it’s important to have a hypothesis about why the outliers exist and whether they require correction/cleaning of the data.
  • Correlation: Quickly plotting a correlation graph between all variables will usually give you a pretty good idea of which hypotheses to test. If you decide to build a model for the exercise, correlation analysis can also guide feature selection.
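The three checks above can be sketched in a few lines of pandas. This is a minimal example under stated assumptions, not a complete EDA: the column names and values are made up, the median fill is just one of the mitigation options mentioned above, and the 1.5 × IQR rule is a numeric stand-in I chose for spotting outliers in place of a distribution plot.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the columns and values are invented for illustration.
df = pd.DataFrame({
    "trip_minutes": [12.0, 15.0, 14.0, 13.0, 16.0, 11.0, 240.0, np.nan],
    "distance_km": [3.1, 4.0, 3.8, 3.3, 4.4, 2.9, 4.1, 3.5],
    "fare": [9.5, 12.0, 11.4, 10.1, 13.0, 9.0, 12.2, np.nan],
})

# First look: dimensions, dtypes and non-null counts, summary statistics.
print(df.shape)
df.info()
print(df.describe())

# Missing values: count them per column, then apply one possible mitigation
# (filling with the column median; dropping rows is the other option named above).
missing_counts = df.isna().sum()
df_filled = df.fillna(df.median(numeric_only=True))


def iqr_outliers(series):
    """Return values more than 1.5 * IQR outside the quartiles (assumed rule)."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return series[mask]


# Outliers: flag suspicious trip durations for a closer look.
outliers = iqr_outliers(df["trip_minutes"].dropna())
print(outliers)

# Correlation: pairwise Pearson correlation between all numeric columns.
corr = df_filled.corr()
print(corr)
```

In a real exercise you would still plot the distributions and the correlation matrix; the numeric versions here are only a quick first pass.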

3. Be hypothesis-driven

In my previous article (5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist), I mentioned the importance of having hypotheses before delving into the data. Hypotheses limit the scope of your analysis and give it a direction. They also give you a narrative to structure your analysis around, which will help when you present your results to your interviewers later.

As mentioned above, EDA helps you come up with hypotheses that you can later prove or disprove. Don’t be afraid that a hypothesis will be proven wrong; that in itself is an interesting conclusion to include in your write-up, as long as the hypothesis was reasonable to begin with.

4. Clearly state assumptions, and back them up whenever possible

The biggest drawback of a take-home exercise, compared to a live interview, is that it’s difficult to ask clarifying questions and communicate with the interviewer. You likely won’t have all the information you need or want to complete the exercise; it could be that you are unsure what aspect of the problem the interviewers are most interested in, or that you are missing data that you need for your analysis. The best way to overcome this drawback is to clearly state any assumptions you are making; many candidates forget this, which can make it incredibly frustrating and difficult for reviewers to follow your analysis. These assumptions can be based on research or even anecdotal evidence or simply common sense (although it’s usually better if you have a robust source or data point to back them up). An interviewer may challenge your assumptions in the onsite presentation of your take-home exercise, so make sure you always have a good reason for your choices.

5. Always tie it back to business impact

In my previous article (5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist), I also mentioned the importance of tying all analyses back to business impact; the same rule holds for take-home exercises.

Most companies provide a sample of their real data for the take-home exercises or at least data that is representative of their real data; the questions they ask are likely relevant to their business, so it’s important to showcase your business acumen (trust me, it’s a rare skill among DS) and prove that you will be able to drive measurable impact.

6. Have structured deliverables

Everyone (hopefully) knows that recruiters spend no more than 2 minutes on each resume, so it’s super important to organize your resume in a way that is easy to comprehend. The same is true for take-home exercises – they are typically reviewed by hiring team members who are helping out with the recruiting process on top of their day jobs, so they will likely spend more than 2 minutes but not more than 20 minutes "grading" your take-home exercise. Make their job easy and keep your deliverables organized and key messages clear.

I normally have three components in my DS take-home deliverables: code, write-up, and presentation deck. Make sure your code is clean, well-formatted, and commented so that readers can follow it. Clearly state any assumptions in comments instead of hiding them in the code itself (see point 4 above). The write-up is probably what most interviewers use to judge your analysis prior to the presentation, so follow the format of a report or research paper and put an executive summary at the beginning. When putting a deck together, keep top-down communication in mind and start with the key messages before delving into details.
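To make the point about assumptions concrete, here is one possible way to surface them in code. The scenario (trip durations), the threshold, and the function names are all hypothetical; the point is only that a reviewer can read the assumptions at the top instead of reverse-engineering them from the logic.

```python
"""Trip-duration cleaning for a take-home analysis (illustrative only)."""
from statistics import median

# ASSUMPTIONS (stated up front so a reviewer does not have to infer them):
# 1. Durations are recorded in minutes.
# 2. Trips longer than 3 hours are logging errors, not real trips.
# 3. Missing durations are rare enough that dropping them does not bias results.
MAX_PLAUSIBLE_MINUTES = 180.0


def clean_durations(durations):
    """Return durations with missing (None) and implausible values removed."""
    return [
        d for d in durations
        if d is not None and 0 < d <= MAX_PLAUSIBLE_MINUTES
    ]


def summarize(durations):
    """Report how many rows were kept and the median of the cleaned values."""
    cleaned = clean_durations(durations)
    return {
        "n_raw": len(durations),
        "n_kept": len(cleaned),
        "median_minutes": median(cleaned),
    }


summary = summarize([12.0, 15.0, None, 240.0, 14.0])
print(summary)
```

Reporting how many rows each assumption removed, as `summarize` does here, also gives you a ready answer if an interviewer challenges the choice during the presentation.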

7. Prepare for the presentation

Unless you are a seasoned presenter, make sure you practice giving the presentation. It’s hard to stay organized and remember all the points you want to make, especially under the stress of an interview, so have presenter notes written down. Of course, you can’t anticipate every question the interview panel might ask, but try to stress-test your hypotheses and assumptions so you don’t get caught off guard and can defend the choices you made in the analysis.

8. Don’t be afraid of leveraging outside resources

By outside resources, I don’t mean asking your friend to do your exercise; I mean looking things up on the internet. The biggest advantage of a take-home exercise, compared to in-person interviews, is that you can use any resource you can find. It’s like an open-internet exam at school, so take advantage of it. Whether it’s third-party data or competitor research, as long as it’s public information, it’s fair game. And speaking from experience, most interviewers will appreciate the extra effort and initiative you put in.

You might be wondering, "How can I finish everything you mentioned within the 2–3 hours the recruiter told me to spend on the exercise?". Well, the truth is, you can’t. I have spent more than 2–3 hours on every single take-home exercise of mine, even though recruiters have always told me otherwise. I’m not telling you to spend a week on it (that would be overkill), but treat it seriously, as if you were working on a mini-project for the team. After all, you want this work to be an accurate representation of the quality of your work on the job, because interviewers are using it as a window into your ability and attitude. Let the take-home exercise make your DS interview, or at least not break it.

I hope this article is helpful for your future take-home exercises for DS interviews. I will be continuing my interview series in the next article where I’ll cover things to pay attention to during the interviews.

