
3 Mistakes Startups Are (Still) Making with Analytics

Move fast and break things – but still be data-informed. Startups must tailor their data analytics practices to focus on delivering…

First, let’s define who we are talking about. For startups from Series A through C, data analytics is probably supported by a small (or no) data science team, with help from BizOps, PMs, or a Chief of Staff. Analysis serves a diverse set of stakeholders and is rarely routinized. As the organization scales and iterates its strategy quickly, analytics needs to keep up: the team needs to provide many answers, and do so quickly.

But while most startup leaders want to make better use of data, they rarely feel they’re getting maximal value from it. Analysis can even come to be viewed as a luxury that slows things down, rather than the decision accelerator it should be. But we can fix that.

Here are the top learnings from our partnerships with startups, as they’ve grown analytics in their orgs.

  1. Don’t overinvest in dashboards, because they are tools designed to create slowly changing views. Your business is moving way faster than Tableau can.
  2. Everyone can analyze data, especially given modern data analytics tools. Walk through the analysis process and figure out where people would get stuck, and buy software or allocate training time to unblock your team.
  3. Data science is all about good vs. evil – specifically targeting the "good" or proactively reacting to "bad." The more narrowly you can define what "good" means, the easier the analysis becomes.

Mistake 1: Overinvesting in dashboards

This is not to denigrate Business Intelligence (BI) tools as a whole. Dashboards are like fortresses; they defend a specific area very well and should require little upkeep over time. You should absolutely have dashboards for key metrics and monitor them. Figure out what you need to review on a regular basis, and build these views.

But there is a temptation (because it is easier) to keep building more and more dashboards. Conversion problem? Make a conversion dashboard with conversion rate and a bunch of filters. Manufacturing line delays? An issues dashboard with a bunch of filters. But it’s a losing bet that just making statements of historical fact available will change the system.

This is pretty, but it is not going to protect you. Photo by William Olivieri on Unsplash

Instead, what we want is a way to attack new strategic questions at speed, and be able to move between many views.

  1. Ugly is fine: Give analysts free rein to create ugly (documented, reproducible) results. Until the board needs to see it, no PowerPoint and no dashboards.
  2. We’re building pipelines, not visuals: We need to create an end-to-end workflow, not just a visualization of existing data. Use tools that enable rapid prototyping starting from robust data prep. BI tools are clunky when moving across views quickly.
  3. Ask for predictive analysis: Describing the past is just okay. If I give you a set of future customers, I want to know who among them buys, who churns, etc… BI tools have started to reach into predictive spaces, but they still work best for stating historical facts. (A minimal sketch of this kind of ask follows this list.)
  4. Talking live moves things along faster: Dashboards are good asynchronous sources of information. However, the stakeholder team should be willing to invest a solid block of time to meet live and work through large parts of the problem dynamically. Having a live conversation is the fastest way to communicate, and data analysis is no different.
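
To make the predictive ask in point 3 concrete, here is a minimal sketch of ranking customers by churn risk. It assumes a hypothetical customers.csv with a few usage features and a churned label; every column name here is invented for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical customer snapshot; feature and label names are invented
customers = pd.read_csv("customers.csv")
features = ["logins_30d", "support_tickets", "tenure_months"]

X_train, X_test, y_train, y_test = train_test_split(
    customers[features], customers["churned"], test_size=0.2, random_state=0
)

# A simple baseline model: the goal is a forward-looking answer, not a leaderboard score
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score held-out customers and rank them by churn risk
risk = pd.Series(model.predict_proba(X_test)[:, 1], index=X_test.index)
print(risk.sort_values(ascending=False).head(10))
```

Even a baseline like this answers "who churns" directly, which is a question a dashboard of historical churn rates never quite reaches.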

If you want to hear more about this area, we recently benefited from a talk by Anthony Deighton, who has held multiple executive roles across Qlik, Celonis, and Tamr, on "Why Dashboards Are Dead: How the Viz Mafia Led Decisionmakers Astray."

Mistake 2: Letting team members believe they "are not data people"

There are data problems that require advanced degrees, but no one is being asked to build a self-driving car. Day-to-day problem solving and low-complexity data analysis can be owned close to the functions the insights are intended to serve.

Data science questions have always organically come up in discussions, planning meetings, and retrospectives. The challenge is that they are sometimes tossed out into the ether with the abominable prefix "I wonder…" and left to hang there, because no one is responsible for answering them. The overworked lone data analyst cannot raise their hand to take on more work. So, extraordinarily often, the question just causes a thoughtful two-second silence, and then discussion resumes on the next order of business.

Instead, people need to start taking ownership of the two most important questions:

  • How much more (or less): Basic access to KPI data and the ability to sum, subtract, and divide these metrics (a sketch in code follows this list).
  • Who or which factors contributed: Identify key drivers, use basic statistical modeling to cut away variables that don’t matter.
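
For the first question, the arithmetic really is this small. A minimal sketch, assuming a hypothetical orders.csv with a date and a revenue column (both names invented):

```python
import pandas as pd

# Hypothetical orders table; column names are invented for illustration
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])

# "How much more (or less)": revenue by month, and its month-over-month change
monthly = orders.groupby(orders["order_date"].dt.to_period("M"))["revenue"].sum()
mom_pct = (monthly.pct_change() * 100).round(1)

print(pd.DataFrame({"revenue": monthly, "mom_pct_change": mom_pct}))
```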

Go through these steps with your working team, and make sure they can sequentially execute on each of these components of analysis. Take a simple question (how does this month compare to last month) and work it through.

  • Data is in a structured environment: If data is for some reason not yet accessible, go look at a combination of data transfer tools like Airbyte / Fivetran and scalable warehouses like Snowflake.
  • The right subset of data can be isolated: Many no-code tools exist to turn SQL into something simple. Developments like visual querying go one step further, using the data visualization itself as a tool for interactive filtering.
  • Intuition can be applied to data cleansing: A tool should scan the dataset, and make it easy for the user to spot values that seem out of place, averages that are too high, etc… and make any necessary transformations.
  • Descriptive analysis is responsive: Visualizations should be drag and drop to create, and fast to render. It should also be easy to relate different variables to each other. This is the lowest bar to clear.
  • Driver identification highlights important factors: No one has the time to explore each variable individually; we should be pointed to the highest over- and under-performers.
  • Predictive modeling is possible, easy, and guided: Automated machine learning is everywhere, and users shouldn’t need code to combine many drivers into a unified prediction of a target variable. Remember, ML is no scarier than any other analytic tool; use it as a superpowered bar chart. (A driver-identification sketch follows this list.)
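
To show how low the bar for those last two bullets can be, here is a minimal driver-identification sketch. It assumes a hypothetical accounts.csv where rows are accounts and columns are candidate drivers plus a spend KPI; all names are invented:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical account-level dataset; all column names are invented
df = pd.read_csv("accounts.csv")
drivers = ["plan_tier", "seats", "onboarding_days", "support_tickets"]

# One-hot encode categorical drivers so the model can consume them
X = pd.get_dummies(df[drivers])
y = df["annual_spend"]

# Fit a quick model and rank drivers, instead of eyeballing variables one by one
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importance = pd.Series(model.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False))
```

Nothing here requires an advanced degree; it is the "superpowered bar chart" from the last bullet.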

If someone on the team feels like some of these steps are not yet possible, try to figure out why (most likely, it’s because Excel doesn’t work for everything). It might mean buying a SaaS product off the shelf to solve the problem, or it might mean taking a day to watch a few YouTube tutorials. Investing in democratized analytics will pay dividends for both the team’s individual development and the company’s analytic potential.

Between 17 and 71 million ways to get it right – Google, retrieved 28 Oct. 2021

Mistake 3: Going fishing when you want to go hunting

This sentence is nightmare fuel for any data analyst: "tell us some insights, we have lots of data."

Your team, while running data analysis last week – Source: stephen momot on Unsplash

Certainly data (dashboards!) can point you toward broad opportunities; for instance, which metrics might be off-target. But at a strategic level, you have ten decisions to make, you would like data analysis to inform all of them, and you really only have the resources to add data-driven insights to five of them. Spending scarce analyst time on fishing expeditions should be reserved for hackathons, for interns, and for after a Series D when you can hire a hundred data scientists.

But it’s actually really difficult to frame a data science question correctly. Even picking a specific KPI to target is insufficient. Let’s say we are trying to improve customer satisfaction. Are we trying to identify part of the workforce for training? Surprising and delighting customers randomly? Working to remediate bad customer journeys before? During? After? Etc…

When assigning someone to take on a "data science" task, the more precisely we define the task, the more actionable and easy to solve the problem becomes. Breaking it down:

  1. Narrow down to a treatable segment: When two elements of the analysis group might perform very differently, splitting them apart simplifies the analysis and prevents wheel-spinning. The segment can be a customer group, a process type, a machine spec, a geography, etc… Trying to run one analysis for new and existing customers at the same time is probably impossible.
  2. Align to action: Focus analyst time on questions that are actually relevant, and help to make go/no go decisions. If no one can name the potential changes that could occur as a result of analysis results, we’re probably doing "analysis for analysis’s sake."
  3. Target variables should be specific, not aggregates: For example, we don’t want to forecast the number of process failures in a month; we want to flag which processes will fail, in the hour or day before it happens. There’s almost always a real "event" that we are trying to predict, and it should be clearly trackable.
  4. Target to increase ROI: You could always apply every activity to everyone, but most initiatives come with a cost (even sending an email). Cut the full population down to a target, using variables you know about everyone. Even a 1- or 2-variable segmentation in a simple bar chart can save you half the cost of a program (a sketch follows this list).
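
As a minimal sketch of point 4, assume a hypothetical population.csv with a per-person expected value and two segmentation variables; all names and the cost figure are invented:

```python
import pandas as pd

# Hypothetical campaign population; column names are invented for illustration
population = pd.read_csv("population.csv")

COST_PER_CONTACT = 0.10  # assumed per-person cost of the program

# A 2-variable segmentation: average expected value within each segment
segment_value = population.groupby(["tenure_bucket", "region"])["expected_value"].transform("mean")

# Only contact people in segments whose expected value clears the cost
targeted = population[segment_value > COST_PER_CONTACT]
print(f"Contacting {len(targeted)} of {len(population)} people")
```

The bar-chart version of this is just the grouped averages; the point is that even a crude cut beats contacting everyone.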

An advanced data science practitioner will push whoever poses the question to provide all of these pieces of information. But organizational leaders should start framing their requests for analysis of strategic plans around these components.

Conclusion

These problems do exist to some degree in larger organizations. However, with more overall data science bandwidth and a slower decision-making process, friction in data usage tends to be less costly there. At startups, the "data team" wears many hats and dashes between priorities. We want to make sure teams spend their time on the right things, and that they are efficient at everything they spend time on.

These three pieces ultimately provide mutual support. Focusing data efforts on timely strategic questions will naturally refocus the organization away from dashboards towards bespoke analysis. In order to support this analysis, teams will make time to upskill to meet these tasks. And to help teams early on, when analysts are a bit less experienced, taking the time to clearly define problems makes questions easier to answer.

This blog was originally published on https://www.einblick.ai/blog/three-mistakes-startups-make/

Einblick is a visual data computing platform, allowing users to quickly build data science workflows – with and without code. Register for a trial at https://einblick.ai/try-einblick/

