1 year doing data science in the real world

The most important lessons I’ve learned so far

Jonny Brooks-Bartlett
Towards Data Science

Ok, so it’s actually 14 months, but that didn’t have the same ring to it as “1 year”. I’ve spent those 14 months at one company, News UK. I’m now about to embark on a new journey at Deliveroo as a consumer and growth algorithms data scientist, so I thought now would be a good time to reflect on the things that I’ve learned during my time at News UK.

I can honestly say that I’ve learned loads. Given that it’s my first job outside of academia, this statement may be obvious. But the surprising thing (at least surprising to me) is that the most important things I’ve learned haven’t been the “hard skills” in data science. Instead, I’ve found that the most valuable things I’ve learned have been the “soft skills”. I have begun to appreciate that to succeed in any role in a large corporate company like News UK, not just in data science, you have to master the soft skills.

To illustrate how important I believe these soft skills are:

I honestly believe that if I worked at News UK only equipped with the skills of SQL, data preparation, linear regression and a logistic classifier, I could be a successful data scientist provided that I completely mastered the soft skills. — Me

Now I know that’s a huge statement but even if you don’t agree with me, you may be able to understand my point of view after reading my post.

Here we go

1) Communication is key

Being able to communicate with everyone around you is the most important skill for success, no matter how good you are at your craft. Communication is the first step to building a rapport with your team and your stakeholders. Once you have built a rapport with the relevant people it’s much easier to manage expectations and to get requests or favours done quickly, among many other benefits.

Things won’t always go smoothly. It’s inevitable that something will go wrong. An example that I experienced was when the analytics dashboard that we had made wasn’t presenting any data for certain articles (this happened more frequently than we’d like). When the stakeholders found out before us it caused problems because the perception was that we didn’t know what the problem was. However, on the occasions that we found out first and let the stakeholders know before they found out, they never got angry. The only request was that we let them know when things would be working properly.

This doesn’t only apply to stakeholders. Within the data technology team, I found that many of my team members loved it when I explained complex data science methods in an easy-to-understand manner. This isn’t just about being able to explain parameter estimation or how a random forest classifier works; it also includes understanding what is important to the other members. When talking about a data science project it’s important to start by motivating the project with the business problem. This allows you to bring everyone along on the same journey of thought that you’ve been on.

I wouldn’t have had the success that I had at News UK if I hadn’t been able to communicate as well as I did, and I’ll make sure to keep improving on this as I continue in my career.

2) Empathy is important for teams to gel

There’s already been comprehensive research into what makes teams successful, with the number one factor being psychological safety. However, another factor that I feel is very important is empathy.

I remember having an entire section of a lecture during my data science bootcamp dedicated to empathy. At the time I didn’t really take it in. But now I feel like I can truly appreciate its importance.

Empathy not only allows you to understand your colleagues but also to respond to their pain points. And it’s this that can make you a better colleague.

A simple example of someone not empathising with you as a data scientist: a colleague asks you to do some analysis really quickly from a long, unstructured PDF document, despite you having a deadline for another piece of work at the end of the day. This person clearly doesn’t understand the process of analysis or why unstructured PDF documents aren’t necessarily optimal for data analysis, nor do they respect the other bits of work that you have to do. This misunderstanding can lead to friction within teams.

There are many different ways that frustrations can arise within teams too. The article “Engineers shouldn’t write ETL: A guide to building a high functioning data science department” highlights some of the frustrations that can occur between data scientists, data engineers and infrastructure engineers. For example, data scientists write bad code which they expect the engineer to productionise right away. I’ve seen (and done) this several times in my role at News UK. This happens because of a lack of empathy and understanding.

Take some time to understand your colleagues. Find out what they do and what their pain points are. Find out what their level of experience is. Knowing these things will help you understand your colleagues. You may be able to proactively help when certain bits of work come your way that will also involve them. Perhaps the data engineer can sit with the data scientist and discuss what good, productionisable code looks like before the data scientist starts writing. The engineer can also explain their process and deadlines so that they set realistic expectations for the completion of the work. This is especially useful when senior employees work with junior employees. Notice that good communication leads to empathy.

3) Perception can be as vital as measurable value

I think the majority of people will tell you that if you’re going to develop a product you’ll need to define a metric by which success can be measured. Typically this involves an A/B test to causally determine if the inclusion of the new product/feature has had a positive impact.
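To make the A/B test idea concrete, here is a minimal sketch of one common way to analyse such a test: a two-sided, two-proportion z-test on conversion rates. This is an illustration only, not something we ran at News UK. The function name, the choice of conversion rate as the success metric and the numbers in the usage lines are all assumptions made for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, conv_b: number of conversions in control (A) and variant (B).
    n_a, n_b: number of users exposed to A and B.
    All names here are hypothetical and chosen for illustration.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100/1000 conversions without the feature, 150/1000 with it.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be down to chance alone, but the causal reading only holds if users were randomly assigned to the two groups.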

However, despite the fact that the majority of products and algorithms that were produced by the data technology team at News UK didn’t go through this process, the products were still considered successful.

This is because the products were driving a cultural shift within the company. People had a positive perception of the data technology team and saw us as a team that could create products that would accelerate their progress in achieving their goals.

I didn’t appreciate this enough at the time. When the senior members of the company are bought in, it makes the team’s life so much easier. For example, it makes securing funding much easier. Remember, you’ve still gotta play the game of politics.

Other important lessons

  • Find ‘data champions’. These are people outside of the data team that are strong advocates of data. They’ll be able to influence others within their teams and also act as product owners for your projects.
  • The people that talk the loudest tend to succeed. This may just be something inherent to big corporate companies with layers of bureaucracy and abundant politics but it feels like the loudest characters tend to rise up the career ladder in those companies.
  • Following on from being the loudest, you need to be able to sell ice to an eskimo. Not only do you need to shout about what you’re doing but you have to sell it too. That’s how you tend to get executive buy-in.
  • Every good machine learning project requires solid EDA and descriptive analytics. It’s tempting to just do the “cool machine learning” model that makes predictions and makes people believe you’re a magician. However, to get the most out of your data you have to understand it and that means exploring it. You won’t get away from standard analytics.
  • Robust code that is continuously monitored is necessary for longevity of a code base. Something will go wrong in production but you can mitigate this by building robust pipelines. Furthermore, if the outputs aren’t continuously monitored then the performance is a mystery. When the original authors have left the company it’ll only be a matter of time before that code base is replaced.
  • Solving business problems isn’t as simple as suggesting solutions. Just because you offer a solution it doesn’t mean that people will listen. You have to be ‘smart’ about it and give feedback in such a way that the relevant people will listen. Get the relevant people aligned on the goal and then take them on your journey to the solution in such a way that they believe they have been jointly responsible for it. Scott Shipp explains this very well in his article: “How to ask questions that drive change at work”. In many cases, the solutions that you come up with on your own aren’t optimal anyway, so it’s beneficial to get the input of others. But it’s a mistake to believe that people see the world the same way that you do.
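The point above about continuously monitoring outputs can be sketched with a very small check. This is a hypothetical example, not the code we had at News UK: it assumes each pipeline output row is a dictionary with an `article_id` key, and the field name `page_views` is invented for illustration. The idea is to find articles with missing data yourself, before a stakeholder spots an empty dashboard.

```python
def find_articles_missing_data(rows, required_fields=("page_views",)):
    """Return the ids of articles whose required fields are absent or None.

    rows: iterable of dicts, each with an "article_id" key (assumed schema).
    required_fields: hypothetical field names the dashboard depends on.
    """
    missing = []
    for row in rows:
        if any(row.get(field) is None for field in required_fields):
            missing.append(row["article_id"])
    return missing


# Made-up output rows from a pipeline run.
rows = [
    {"article_id": "a1", "page_views": 10},
    {"article_id": "a2", "page_views": None},
    {"article_id": "a3"},
]
print(find_articles_missing_data(rows))
```

In practice a check like this would run on a schedule and alert the team, so that you can tell stakeholders about the problem and when it will be fixed.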

There are many more important things that I’ve learned over the past year but I feel these are the most important ones.

What do you think? Are there important lessons that I’m missing? Do you disagree with any of these? Feel free to continue the discussion in the comments. Thanks for reading ☺


Data scientist at Deliveroo, public speaker, science communicator, mathematician and sports enthusiast.