What to Look for When Scaling Your Data Team

What should managers look for when growing their teams? And what tools can provide relief for their already overburdened staff?

Sheel Choksi
Towards Data Science
7 min read · Jul 27, 2021

If you work with data, you’ve likely heard the phrase “data is the new oil.” But if data truly is the new oil in today’s digital economy, we’re going to need a lot more people working the rigs and refineries to keep up with demand.

Today, data-driven innovation has become a strategic imperative for just about every company, in every industry. But as organizations expand their investment in analytics, AI/ML, business intelligence, and more, data teams are struggling to keep pace with the expectations of the business.

For example, data engineers are often overburdened by the time required to build and maintain data pipelines. The resulting backlogs leave data analysts and data scientists waiting, sometimes for a long time, for the data they need to do their work.

Using the oil analogy, think of it like this: Anytime oil rigs and pipelines experience delays, the refineries will also slow down, which can have a major effect on distribution channels and, ultimately, consumers driving up to the gas station. The same is true for data — any backlogs across the data lifecycle can lead to delays that prevent downstream users and consumers from accessing, analyzing, and deriving insights from data.

Businesses will only continue to rely more heavily on their data teams. However, recent survey research suggests that 96% of data teams are already at or over their work capacity. To avoid leaving their teams in the lurch, many organizations will need to significantly scale their data teams’ operations, in terms of both efficiency and team size. In fact, 79% of data teams indicated that infrastructure is no longer the scaling problem, which puts the focus on people and team capacity. But what should managers look for when growing their teams? And what tools can provide relief for their already overburdened staff?

Building Out Your Dream Team

The first step for managers of data teams is to evaluate their teams’ current skills against the projected needs of the business. Doing so gives managers a clearer picture of which skill sets to look for when interviewing candidates. Some considerations to keep in mind include:

  • The skill sets of the existing team. Creating value from data often requires a mix of skill sets spanning data engineering, data analysis, data science, and more. On smaller teams, one person often covers several of these. Evaluating where your team could use more help or experience helps to identify high-leverage hires. Expanding the team is also a good chance to find work that someone is covering for but would rather spend less time on. For example, a few people might be pitching in on data engineering when they would much rather spend that time on analysis. Bringing on a data engineer puts someone passionate about that practice in charge of it and frees the others to spend more time on their own passions and value-added activities.
  • Identify bottlenecks. Data collection, cleaning, aggregation, analysis, machine learning, reporting, etc. all work together to create business value. As with any bottleneck, slow delivery or problems in one stage typically limit overall output and, in turn, hurt team morale. Even if analysts greatly outnumber data engineers, it often still makes sense to add more analysts if the end output is mostly constrained by analyst bandwidth. Similarly, it’s easy to look at a skill set (data science, for example), see that it doesn’t exist on the team, and look for that hire. The highest-performing teams, however, evaluate this skill gap holistically: perhaps the business isn’t ready to manage machine learning effectively, or the needs can be met with a platform that doesn’t require an FTE, and the savings would be better spent doubling down on another skill set.
  • Keep in mind the newness of a modern data team. We’ve all seen job postings requiring 10 years of experience for a technology that’s only been out for 6 years. Although the practice of data has been around for a while, certain roles and skill sets (like modern data science and data engineering) are much newer. When looking for great candidates, it often makes sense to lean more on past experience in relevant adjacent roles than on the precise role or job title. For example, a software engineer may have all the right background to work as a data engineer without ever having held that specific job title.

Ask the Right Questions

Next, managers need to carefully consider the interview process. Going beyond the general, HR-mandated interview questions, managers should use interviews as an opportunity to drill into each candidate’s skills, ambitions, and aptitude to learn how they might benefit the growing team. Here are the topics managers should cover:

  • Passion for the data space. Whether new to the industry or a long-tenured practitioner, the best candidates stay curious about learning and passionate about delivering. To surface this excitement, managers can discuss past projects, side projects, or the practices and technologies the candidate is interested in trying out next.
  • Technical acumen. Most roles in the data space have a technical component. Analysts are often expected to have a strong grasp of SQL (not just basic group-bys and counts, but answering more complex questions, like using lead/lag to determine the median time between user transactions). Working through a small, real-world SQL analysis (not a programming puzzle) can help ensure the candidate’s skill set meets the requirements. Working through it together in a “building” session is a higher time commitment but often gives much more insight into what it looks like to work with the candidate. These same building sessions apply to other roles on the team.
  • Tie in to business objectives. Most data teams don’t have “research” as an end objective. Instead, research, model training, providing reporting, creating analysis and recommendations are all tied to creating outcomes. Managers should vet a candidate’s drive for owning outcomes and their business acumen. Their business acumen is especially important considering how close data teams are to company leadership.
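
As a rough illustration of the lead/lag exercise mentioned above, here is a minimal sketch in Python using the standard library's sqlite3 module. The transactions table, its columns, and the sample rows are all hypothetical, and the sketch assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. SQLite has no built-in median, so the gaps between consecutive transactions are computed in SQL with LAG and the median is taken in Python.

```python
import sqlite3
from statistics import median

# Hypothetical sample data: (user_id, transaction day as an integer offset).
ROWS = [
    ("alice", 1), ("alice", 4), ("alice", 10),
    ("bob", 2), ("bob", 3),
    ("carol", 5),  # a single transaction produces no gap
]

# LAG looks up each user's previous transaction day; the outer query
# turns that into a gap and drops each user's first transaction.
GAP_QUERY = """
SELECT user_id, day - prev_day AS gap_days
FROM (
    SELECT user_id,
           day,
           LAG(day) OVER (PARTITION BY user_id ORDER BY day) AS prev_day
    FROM transactions
)
WHERE prev_day IS NOT NULL
ORDER BY user_id, day
"""


def median_gap_days(rows):
    """Return the median gap between consecutive transactions, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE transactions (user_id TEXT, day INTEGER)")
        conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
        gaps = [gap for _user, gap in conn.execute(GAP_QUERY)]
    finally:
        conn.close()
    return median(gaps) if gaps else None


print(median_gap_days(ROWS))  # gaps are 3, 6 and 1, so this prints 3
```

In a building session, the interesting part is usually the conversation around the query: how the candidate handles users with a single transaction, or whether the median should be taken per user rather than across all gaps.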

Watch Out for Red Flags

Referring back to our “data is the new oil” analogy, it’s no secret that intensive work on an oil rig is not for the faint of heart. We’ve all been there, whether it’s on a rig, a data team, or basically any type of team: a candidate aces the interview, looks great on paper, and signs on to officially join your team. It’s only later that you realize this individual is probably not well suited for the role or the team. While hindsight is 20/20, there are several common red flags to look out for during the hiring process.

  • Mismatches between the candidate’s interests and the company’s challenges. Some data engineers are driven by the volume of data that their pipelines process; others are driven by the rate of business value they can deliver. A data engineer excited to process terabytes of data a day will likely self-select out of the hiring process at a small startup that works with a few gigabytes a day. In case they don’t, it’s worth double-checking what else is drawing the candidate to the role.
  • Interest in only one part of data responsibilities. Outside of very large data teams that can support fine-grained data roles and responsibilities, most teams need people to take on a few different types of tasks to deliver end business value. Candidates who aren’t willing to pitch in on some extra data cleaning and validation (a task that might traditionally fall to data engineering) may not be the best fit for team success.

Automation: Your Favorite New Coworker

Scaling data initiatives operationally requires more than just increasing team size. Today, data teams are inundated with the work needed to keep up with the rapid pace of business, so managers should consider automation tools that help the team get work done faster and more efficiently, without bottlenecking the data lifecycle.

Automation technologies empower teams to essentially “autopilot” the small-scale tasks, so they can focus on more complex and innovative initiatives for the business. The previously mentioned survey research also found that a majority of data professionals (73%) view automation as an opportunity to advance their career. Ultimately, with the right team and the right range of skills, these automation tools can act as a way to significantly augment your team’s bandwidth and output.

Another approach to automation that I see tremendous value in is flex-code. Despite the huge surge in low- and no-code tools, many teams have discovered serious limitations with these technologies. The challenge is that many of these tools don’t let teams customize code when needed, which can prevent them from fully reaping the benefits of low- and no-code.

The right approach depends on the needs of the business, and it’s ultimately up to team managers to decide how best to scale their team and where to implement automation to achieve their overall goals. But with the right people and the right tools in place, businesses can deliver insights faster and capture greater value from their new most valuable natural resource: data.
