7 Steps to Landing Your Dream Job as a Data Scientist

Navigating the Data Science Interview Process

Josh Bernhard
Towards Data Science

--

Getting a job is hard. I recently went through the process of finding a new role as a data scientist, and I would like to help others successfully navigate the process of landing their dream roles as data scientists with a few key tips:

  • Tip 1: Find the role you want.
  • Tip 2: Getting a job means dealing with rejection. Be PERSISTENT.
  • Tip 3: Study Statistics, Machine Learning, SQL, and Python.
  • Tip 4: If you want the job, go above and beyond.
  • Tip 5: Study the company culture, people, and business models.
  • Tip 6: Negotiate and leverage.
  • Tip 7: Choose the role that fits best for YOU.
  • Extra: Resources to Help You Prepare

Tip 1: Find the role you want.

The title Data Scientist covers different responsibilities at different companies, and sometimes even within the same company. The Head of Data Science at Airbnb discusses the three data scientist roles within Airbnb, which may differ from other companies. Data Scientists at Airbnb work on analytics, algorithms, or inference.

Dan Frank (Data Scientist at Coinbase) discusses the roles of Data Scientists at Airbnb

At Lyft, they divide the roles of data scientists into two groups. Data scientists who work on the sophisticated algorithms for matching riders and drivers, and on scaling these systems, are known as Research Data Scientists. The second set of data scientists are known internally as Data Scientists, which is the role I applied and interviewed for.

From my interactions with this team of data scientists, they act very much as Product Data Scientists, determining what metrics matter most in measuring the success of products, how to measure these metrics, how to improve existing products, and what new products to pursue.

The other companies I interviewed at had smaller data science teams with less defined roles. At these companies, I was concerned with the problems Data Scientists were trying to solve for the company. It was important to me that they had systems that were already in place to ensure that the endeavors of data scientists at these companies would be successful.

  • For example, had they hired the engineering talent already to collect the data necessary to answer their questions of interest and build out data science products?
  • Was there someone in leadership that created the right sort of hype associated with data science applications?
  • Were there both realistic expectations of timelines for building out certain data science applications, as well as an understanding of how data science would be integrated into organizations around the company?

Ask questions about how you (and other Data Scientists on your team) are expected to make an impact on the company goals.

These questions allow the company to understand what you bring to the table, as well as allowing you to understand what their expectations are for you. Given the versatility of skills associated with the title Data Scientist, it is important that both you and your future employer are on the same page.

Tip 2: Getting a job means dealing with rejection. Be PERSISTENT.

Once you have figured out the types of roles you should apply for, you need to apply to these roles. In my experience, getting pulled from the stack of submitted applications on the strength of the application alone is close to impossible.

This does not mean that you can skip the step of entering your application into the pile of applications being submitted.

Instead, it means that you must find a connection in addition to submitting your application. Below is a discussion about getting recruited held by Udacity recruiters and hiring managers; notice that networking pops up a few times, as does the importance of finding the hiring manager associated with open positions. Specifically, you might look at timestamp 5:00–9:40 (though this entire video is useful for getting your foot in the door).

A couple of resources I found helpful in networking and gaining referrals:

  • LinkedIn: I reached out to previous co-workers, friends, and others in my network who were posting jobs.
  • AngelList: I used AngelList to search for jobs at startups.

Additionally, I used both of the above to find and reach out to recruiters and hiring managers associated with the jobs I was applying for. You should not expect a reply from reaching out to these individuals. However, you should show them you can be professional, polite, and persistent, and often they will respond if you’re a good fit for an open role.

If you know someone internally at a company where you are interested in working, leverage your relationship with that individual to see if they can learn about the position for which you applied. At the same time, do not hurt your chances of working with individuals in the future. If they do not believe you to be the right fit, thank them for their time and move on.

We all experience rejection. Below is an example of one of many rejections I received. I chose a particularly nice one, because it made me feel better about all the rejections I received! There were many!

Some companies will not respond to your application, or may not even send you a rejection after completing on-site interviews. They may just leave you hanging in a state of unknowing. Your goal as a candidate is not to berate the interview process of any company. Your goal is to get a job. Stay focused.

It doesn’t matter how far into the interview process you make it; if a company does not believe you to be the right fit, again, thank them for their time and move on.

The talk below is by Greg Kamradt, who now works as a Data Scientist at Salesforce. If you are serious and committed to your job search, he has some very useful tips on building a funnel for your job search. In the first 2 minutes of the video, you get an idea of Greg’s ability to persist. He was able to withstand rejection in order to land the one job offer that was a match for him and the company.

The newer you are to the field, the harder getting your foot in the door is likely to be. Some people are lucky and land a role quickly, but I can tell you they are the minority. Landing a job for most means experiencing a lot of rejection. You must be PERSISTENT.

Tip 3: Study Statistics, Machine Learning, SQL, and Python.

Once you have successfully caught the attention of a company interested in your skills, you enter what I will call the “interview pipeline”. The screening process for most jobs follows a similar pattern.

  1. Initial phone interview
  2. Technical phone interview
  3. Take home assessment
  4. Onsite interview
  5. Final negotiations phone call

In the remainder of this section, I will dive into the details of each part of the interview pipeline. If you are currently going through interviewing, this part of the post might prove the most relevant to you.

Step 1: Initial phone interview

The first touch point for candidates in the interview pipeline is frequently an informal conversation with a recruiter or hiring manager. During this conversation, you are provided more context about the position. You are also informed about the steps of the interview process.

It is useful at this stage to have a few questions ready about the role, as well as questions about the company. Questions about company culture and values, the individuals on the team, and responsibilities of the specific role are all good to have with you for this interview.

At the end of the interview, you will set up a date and time for your first technical phone screen. An email usually follows with this information as well. You should make sure to be appreciative of the time and explanations given during the interview, and let them know you are looking forward to the next steps of the process.

At this point, you are hopefully more excited about the role and joining the company.

Step 2: Technical phone interview

In the second step, you are likely to be asked questions related to statistics and machine learning. It is important not only to answer correctly when a question has a right answer, but also to communicate your ideas clearly and concisely.

I was asked questions related to Bayesian statistics, linear regression, interpreting the results of an analysis, and defining metrics. In many cases, the questions were simple and straightforward. There were no tricks in the questions I was asked, but I have heard from others who were asked questions where a specific, tricky answer was required to pass.

In my interview cases, the interviewer appeared to be checking that I had fundamental knowledge in working with data, and I could communicate to non-technical and technical audiences. The companies usually tied these ideas to their own business objectives, but sometimes they were questions that you could find on a checklist of interview questions from a book or blog.

Below I have provided a couple of questions similar to those that I answered.

Example (Part a):
How would you describe the difference between linear and logistic regression to a non-technical employee? You may choose to provide examples of when you might use one or the other to assist with your description.

Possible Answer (Part a):
In both linear and logistic regression, you are using input variables to predict some output variable. For example, you could use the day of the week and the month to predict the sales for each day. The sales would be the output variable, and the day of the week and month are the input variables.

This would be a problem where you would use linear regression, because the output variable (sales collected each day) is a continuous value similar to age or height. For logistic regression, the output variable is commonly one that only has two possible values. For example, we might use logistic regression to predict if a customer will buy or not buy a particular product.

Example (Part b):
From this regression, you would get coefficients. How would you describe the coefficients you would get from this model? Do you know how the interpretation of the coefficients would be different for the logistic regression model?

Possible Answer (Part b):
I chose day and month as the inputs to the model, which are categorical variables. Quantitative variables tend to be easier to interpret. But that’s ok. We would want to change these into a number where the order gave the sequence of the days or we would want to dummy them.

Let’s say I dummy them, then one of the values would be a baseline. For the days of the week, the baseline might be Sunday. Then the coefficients would be attached to dummy variables for the other 6 days of the week. If the coefficient for Monday was 12, this would mean that the sales for Monday are expected to be 12 higher than for Sunday in the same month. If the coefficient for Tuesday was -15, this would mean that the sales for Tuesday are expected to be 15 less than Sunday. It is always a comparison of the day to the baseline.

For the logistic regression model, the response is the log odds, and not just the predicted response. It is common to take the exponential of the coefficient, and then interpret the resulting value as a multiplicative change in the odds of being in the 1 category as compared to the 0 category. If 1 was someone making a purchase and 12 is the coefficient associated with Monday, then exp(12) would be the multiplicative change in the odds of making a purchase when it is Monday compared to the baseline of Sunday.
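To make the interpretation above concrete, here is a minimal sketch in plain Python. It assumes a simplified model that contains only day-of-week dummies (no month term), and every number in it is made up for illustration. In that simplified setting, each linear coefficient is the difference between that day's average and the baseline's average, and each logistic coefficient is the log of the ratio of that day's odds to the baseline's odds.

```python
import math

# Hypothetical daily sales grouped by day of week (made-up numbers).
sales_by_day = {
    "Sunday": [100.0, 110.0, 90.0],   # baseline day
    "Monday": [112.0, 118.0, 106.0],
    "Tuesday": [85.0, 80.0, 90.0],
}

def mean(values):
    return sum(values) / len(values)

baseline_mean = mean(sales_by_day["Sunday"])

# Linear regression with only day dummies:
# coefficient = that day's mean minus the baseline day's mean.
linear_coefs = {
    day: mean(values) - baseline_mean
    for day, values in sales_by_day.items()
    if day != "Sunday"
}
print(linear_coefs)

# Hypothetical counts: (visitors who bought, visitors who did not buy).
purchases_by_day = {"Sunday": (20, 80), "Monday": (40, 60)}

def odds(bought, not_bought):
    return bought / not_bought

# Logistic regression with only a Monday dummy:
# coefficient = log of (Monday odds / Sunday odds).
monday_coef = math.log(
    odds(*purchases_by_day["Monday"]) / odds(*purchases_by_day["Sunday"])
)
odds_ratio = math.exp(monday_coef)  # multiplicative change in the odds
print(round(monday_coef, 3), round(odds_ratio, 3))
```

With these made-up numbers, the Monday linear coefficient of 12 reads as "Monday sales are expected to be 12 higher than Sunday sales", while the exponentiated logistic coefficient reads as "the odds of a purchase on Monday are roughly 2.7 times the odds on Sunday".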

Example (Part c):
Explain how the coefficients in your model are calculated. What makes these values better than say just random values for each coefficient?

Possible Answer (Part c):
It is common to actually start your models with random values for the coefficients. These values are then updated using gradient descent. You can think of the updates to the coefficients as moving in a way that minimizes whatever error you would like to minimize. In regression, it is common to have an error as the sum of squared differences between the actual and predicted values.

You can then think of gradient descent as an algorithm that changes the coefficients over many iterations, in small ways that always work to reduce the loss function (in this case the squared difference between the actual and predicted values). Therefore, at the beginning of the algorithm, they aren’t better than random. However, at the end of the algorithm, they will be better in terms of minimizing whatever loss function you are interested in minimizing.
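The idea in this answer can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is made up to follow y = 2x + 1 exactly, and the learning rate and iteration count are values I picked so that this small example converges.

```python
import random

# Made-up data that follows y = 2x + 1 exactly, so the best coefficients are known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def mean_squared_error(slope, intercept):
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)            # seeded so the run is repeatable
slope = rng.uniform(-1.0, 1.0)    # random starting coefficients
intercept = rng.uniform(-1.0, 1.0)
learning_rate = 0.05              # assumed step size; too large a value can diverge

loss_history = [mean_squared_error(slope, intercept)]
for _ in range(2000):
    errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
    # Gradient of the mean squared error with respect to each coefficient.
    grad_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_intercept = 2.0 * sum(errors) / len(xs)
    # Step each coefficient a small amount in the direction that lowers the loss.
    slope -= learning_rate * grad_slope
    intercept -= learning_rate * grad_intercept
    loss_history.append(mean_squared_error(slope, intercept))

print(round(slope, 4), round(intercept, 4))
print(loss_history[0], loss_history[-1])
```

The starting coefficients are no better than random, and the final ones sit very close to the slope of 2 and intercept of 1 used to generate the data. For ordinary least squares the same coefficients can also be computed directly with a closed-form solution; gradient descent is shown here because it is the approach the answer describes.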

In this example, notice the language used to answer each question. I try to use non-technical language in answering the questions. If the interviewer wants you to dive into the technical details, they will ask a question that prompts a dive into the technical parts they care about. The first responses were easier to keep less technical, but often the interviewer will continue to push for more technical aspects to gauge your level of understanding of each topic.

At the end of my interviews, many companies commended me on my answers being approachable. Some people try to drown their interviewer in technical jargon; I would not recommend this approach. I frequently tried to use the interview as a way to show I can communicate my ideas to both technical and non-technical members across an organization.

Other examples of questions asked at this point in this process were:

1. How would you design an experiment to test ___________?

2. Imagine you have an urn with some number of blue, red, and white balls; what is the probability of choosing two reds before choosing either a white or blue?

3. Imagine you have two coins, one is fair and the other has two head sides. You randomly choose one of the coins, and flip it 5 times. All of the flips are heads, what is the probability it is the fair coin?

4. How are posterior intervals for estimating a parameter of interest different than intervals obtained using frequentist methods?
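As one way to work through question 3 above, here is a short sketch that applies Bayes' Rule with exact fractions. It assumes each coin is equally likely to be picked and that the flips are independent; the variable names are my own.

```python
from fractions import Fraction

flips = 5  # number of flips observed, all heads

# Prior: each coin is equally likely to be the one chosen.
prior_fair = Fraction(1, 2)
prior_two_headed = Fraction(1, 2)

# Likelihood of seeing all heads under each coin.
likelihood_fair = Fraction(1, 2) ** flips
likelihood_two_headed = Fraction(1)

# Bayes' Rule: P(fair | all heads) = P(all heads | fair) * P(fair) / P(all heads)
evidence = prior_fair * likelihood_fair + prior_two_headed * likelihood_two_headed
posterior_fair = prior_fair * likelihood_fair / evidence

print(posterior_fair)
```

In an interview it helps to say the structure out loud (prior, likelihood, evidence) before plugging in numbers, since the interviewer is usually checking the reasoning as much as the final value.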

Step 3: Take home assessment

The take home assessments varied pretty heavily from company to company. In some cases, a company clearly offered a take home to make sure a candidate could write code, arrive at a solution, and would be able to contribute without needing someone looking over their shoulder. In this first case, creating any basic end-to-end solution was sufficient for “passing”. In other cases, the take home was EVERYTHING that mattered in getting the job, and it was extremely important to put as much time and effort into your solution as humanly possible (this is only a slight exaggeration).

With this in mind, I am going to introduce another tip here.

Tip 4: If you want the job, go above and beyond.

It seems fair to me to ask your recruiter or hiring manager how the take home assessment would be “graded”. If the take home is graded by multiple reviewers and/or they ensure your identity is anonymous in the checking of your assessment to avoid possible bias, it is safe to say the company takes the take home assessment very seriously. Alternatively, if the hiring manager just asks you to complete it and try to show off your skills, this part of the interview is just a simple check of your skills.

In either case, you should put your best foot forward. However, in one case you might spend your entire weekend from morning to night checking every idea that might be interesting, and in another, you can probably hack together a reasonable solution in a few hours that will move you to next steps in the process.

I found it interesting that going into this part of the process, I thought it would be really important to write really clean code, include docstrings, add unit tests, and ensure that what I was doing followed DRY principles, but none of my interviewers seemed to care too much about this. Perhaps this was due to the limited time associated with completion.

No matter how much emphasis the company places on the take home assessment, you can use the assessment to stand out against any other applicants. Doing more on the take home will help impress the hiring manager and increase your chances of landing an on-site interview. If you really want the job, you should go above and beyond to stand out.

In all cases, the interviewers seemed to care about:

  1. How creative was my solution? (Did I engineer interesting features?)
  2. Did I communicate what I was doing and why?
  3. Did I answer the questions they asked? (as opposed to a question that was not asked)

Step 4: Onsite interview

After completing the take home, the next step in the process is an onsite interview. For onsite interviews, you frequently meet with many individuals that you would work with on a regular basis. For every company, performing well in the onsite is crucially important to getting a role. Even if you provided the most amazing solution ever seen from the take home assessment, you will not get an offer if the onsite interviews do not go well.

I know this sounds stressful, but honestly this was my favorite part of the interviewing process. You get fun questions, and you can talk through with another person how you would go about solving interesting problems. To me, that was fun!

With this in mind, how do you ensure the onsite goes well? First, let’s consider the different aspects of the onsite. In my experience, I had 4–6 meetings covering a range of topics. Typically, these topics included those listed in the title of Tip 3: statistics, machine learning, SQL, and Python. Additionally, there were conversations to see how well you could tie each of these ideas to each company’s business. Here, I will add our next tip, but this blends a bit with the earlier point.

Tip 5: Study the company culture, people, and business models.

Though this tip doesn’t show up in all of the below points, it shows up at the end of the list, and it is probably the most important thing to keep in mind for the entire on-site interview.

All interviews contained the following components:

  1. SQL: Writing out SQL on a whiteboard to combine data sources. In my experience, I was not asked any questions related to performing SELF JOINs or WINDOW FUNCTIONs. I have heard from others that some companies do ask questions related to these topics.
  2. Python: Many of the questions asked of me had data related to time. Knowing how to work with days, months, years, and create appropriate indicator variables proved to be useful for take home and in person interviews.
  3. Statistics: Having a strong grasp of exactly how certain measures of similarity or difference are calculated was very helpful. I was also asked basic questions about probability and probability distributions including using Bayes’ Rule in a number of interviews.
  4. Machine Learning: Being able to talk about why using Euclidean distance might be better for certain cases instead of a Spearman or Pearson correlation coefficient, how splits are made in a decision tree, or the pros and cons of stochastic gradient descent compared to batch gradient descent all proved to be useful topics. You don’t have to know every last detail, but the more you know and can leverage the better.
  5. Python: For all take home assessments, I heavily leveraged Python. The main libraries related to data analysis got me most of the way there: numpy, pandas, scikit-learn, etc. It didn’t seem like most companies cared too much if I had wanted to use R. However, it seemed like once you got inside the company, you were largely going to be using Python. Many of the in person interviews had a component where being able to think about how you would solve a problem in Python proved useful even if you weren’t asked to write it.
  6. Business: I cannot stress this part enough — research the company. How do they make money? How does your role tie into the business goals? Learn as much as possible about the company. Understand what they do, who their customers are, and how you can help them both sustain and grow. Understanding their competitors helps too!
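To give a flavor of the first two items in the list above (combining data sources with SQL, and building indicator variables from dates), here is a small self-contained sketch using only Python's standard library. The `customers` and `orders` tables, their columns, and all of the rows are hypothetical and exist only for this illustration; they are not taken from any interview.

```python
import sqlite3
from datetime import date

# Build two tiny in-memory tables (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         order_date TEXT, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (10, 1, '2018-06-02', 25.0),
        (11, 1, '2018-06-04', 40.0),
        (12, 2, '2018-07-01', 15.0);
""")

# Combine the two data sources with a join, the kind of query asked on a whiteboard.
rows = conn.execute("""
    SELECT c.name, o.order_date, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
conn.close()

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def time_features(iso_date):
    """Derive simple time-based variables from a YYYY-MM-DD string."""
    d = date.fromisoformat(iso_date)
    return {
        "month": d.month,
        "day_of_week": DAY_NAMES[d.weekday()],   # weekday(): Monday is 0
        "is_weekend": int(d.weekday() >= 5),     # indicator variable
    }

features = [
    {"name": name, "amount": amount, **time_features(order_date)}
    for name, order_date, amount in rows
]
for row in features:
    print(row)
```

In practice you would more likely do the date handling with pandas, but being able to explain the same steps without a library is a reasonable way to show you understand what the library is doing.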

I am very confident that some of the interviews I did not get or that I failed early in the process were because I didn’t study the business and culture of companies well enough. The more you know about the company and the people in it, the better you will perform at all parts of the interview.

Learn about the product, and determine how you can contribute to making it better.

Step 5: Final negotiations phone call

Should all go well, you will make it to a final call where you are extended an offer. Though there are no technical details to this phone call, it is important to be prepared. In my experience, when an interview was going well, some negotiations around salary even occurred during the on-site interview process. This leads to another tip.

Tip 6: Negotiate and leverage.

I am admittedly not great at this part of the process, but here are my recommendations. When you are asked about your salary expectations,

  1. Give a range.
  2. Talk about what you have researched to be true about the pay for similar positions and titles.
  3. Leverage what you bring to the table that sets you apart from other candidates.
  4. Stress how excited you are about the work and team. Money is one factor of the job; it shouldn’t be the main reason you are choosing a role. If it is, you and the company should both be worried.
  5. You certainly are welcome to negotiate, and many companies expect you to try and negotiate. They may say no, but they are not likely to revoke an offer simply because you negotiate. Offering reasons can be helpful; is there a benefit that isn’t offered that you need to cover yourself? Did you get competing offers and you would like a match?
  6. Some people suggest trying to get them to always say the first number. I think if you give a range that you are truly expecting to land in, at least you have a starting point for both parties.
  7. Note, in some areas of the world, it is illegal for them to ask you what you are currently making. They should pay you based on what you provide to the company.

Example:

At the end of the on-site interview process:

Recruiter: So, we have reached a point where we need to talk about salary expectations. It seems like things went really well, and the team likes you from my conversations so far. Do you have a number in mind?

You: The going range for this role seems to be 100,000 to 140,000 for my level of experience. Since I have additional experience related to the tools the team is using, I would expect that to also help.

Recruiter: Alright, I will talk to our finance team to see where we land.

Then during the final phone call:

Recruiter: So I chatted with everyone and we would like to extend you an offer. First let’s go through the benefits… chatter about benefits… (additionally you might have) we should also talk about stock options… Do you have any questions about that part?

You: Nope, that all sounds good to me (or ask questions as you have them).

Recruiter: I have talked to our finance department, and we can give you an offer at 88,000. Given all of the other benefits, you are definitely falling in the window you provided.

You: I really like the team, and I want to join. However, I was really expecting to be in the range we talked about. I actually expected to be towards the upper end of it given how much value I can add almost immediately to the team.

Recruiter: I was given a little wiggle room, I can come up to 95,000.

At this point, it is all about how much you want to push, but you shouldn’t threaten or say something you do not follow through on. Additionally, you should not lie about your current salary to try and push the bar higher. Often companies will ask for W-2s when you are hired, so they will know you lied. That isn’t exactly the foot you want to start off on with a new employer.

You: I was really hoping to make this work, but unfortunately that is still lower than I expected based on industry standards. Is there any way we can meet around 125,000?

You can continue this process until you land on an agreement or you decide you cannot make something work. Assuming things go well, you can ask for time to make final decisions and think things over. This is completely reasonable. Most companies are comfortable with you taking one or two weeks to make final decisions.

If you are lucky, you may even have options. There can be a lot of opportunity out there once you have opened your first door and made the right connections. In these cases, I have a final tip.

Tip 7: Choose the role that fits best for YOU.

Determine the team you want to work on, the problems you want to solve, and the impact you want to make. There isn’t one right decision for everyone. Make the decision that is best for you.

Extra: Resources to Help You Prepare

Here I have provided some helpful (free) resources to help you prepare for different components of an interview for a Data Scientist position.

SQL
Udacity’s SQL Course
SQL on w3schools
SQL Zoo

Python
Python Codecademy
Python I Udacity
Python II Udacity
Codewars for Python and SQL
HackerRank
LeetCode

Machine Learning
Luis Serrano’s YouTube Channel
Introduction to Statistical Learning Text
Udacity Machine Learning
Technical Interview Algorithms on Udacity

Statistics
Descriptive Statistics on Udacity
Inferential Statistics on Udacity
Statistics on Khan Academy

Soft Skills
Soft Skills on Alison

I will continue to add to the list over time, but this is a start. If you have additional resources that you have found useful, please feel free to add in the comments. I can then update the list to help others prepare! To get you excited about your future roles as data scientists, check out the video below that discusses a number of topics (including more on interviewing)!

Thanks for reading!

I communicate in a way that some people like and some don't. I like plaid. The views expressed here are my own.