
How We Created an Ad Hoc Analytics Process

Bringing together business and data science stakeholders with a centralized queue for analytics requests


Photo by Daria Nepriakhina 🇺🇦 on Unsplash

How do you manage Analytics requests coming in on an ad hoc basis? What began for us as a shared email listserv years ago matured into a full-fledged process for capturing, vetting, resourcing, and sharing the output of all "Ad Hoc Analytics" (AHA!) for our data science team. It’s now complemented with a front end for stakeholders to submit requests and a back end for analytics teams to triage incoming tickets. Multiple teams contribute to the process, for whom it serves as an entry point for all new or unplanned analytics.

As our data scientists design features, experiments, or models during project development, they become experts in the ancillary data and subject matter domains. Becoming a de facto expert in some domains means other teams will ping us for advice and EDA, and it’s easy to let our long-term, strategic priorities languish when context-switching between those ostensibly small inquiries. But we don’t want to ignore those questions either – data is a competitive edge that accumulates over time and we aspire to support all teams as the in-house data experts.

Our request process is a tool to help our Data Science team balance this need for short-term investigations against committed projects and long-term, strategic priorities. Our thinking on the utility of this centralized queue has evolved over time. Here, we’ll review the reasons our data science team instituted and continues to support this kind of process, and what we learned works (and doesn’t) in managing our "AHA!" queue.

Why have an analytics request process?

Initially, our humble email inbox was a catch-all for the disparate and intermittent research requests for our fledgling data science function. As our teams of data scientists and business analysts grew over the years, the scope and utility of our intake process changed. Today, it’s not only a tool for our key internal customers but also serves core functions for our analytics teams. Here are some of our key reasons for maintaining this process:

1) The intake portal is our single entry point for the constant deluge of analytics requests

If your experience as a data scientist is anything like ours, there is a constant din of questions arriving via email, verbal requests, and sporadic Slack messages¹. Requests are rarely as simple or straightforward as we – let alone non-technical stakeholders – would like them to be, so stepping away from a project to follow up on a seemingly innocuous inquiry can rapidly eat away at a data practitioner’s time. Just tracking this relentless barrage of questions quickly becomes a full-time job.

How I imagine someone thinks we answer a "quick question."

Initially, our request process was born out of necessity for the burgeoning data team to manage this growing influx of requests. Without a consistent landing ground and with only a few data scientists, it was easy for practitioners to get lost in the steady flux of small and large questions, each spawning a fractal of rabbit holes and follow-up analysis. It was difficult for leadership to see the full scope of what was being asked and who was asking it, to prioritize requests, and to gauge what resources would be needed to support them. The queue gave us a place to shunt all asks to a single location where we could sort, triage, and track what the hell was going on.

This is still the primary function of our queue today – it captures all our new, unplanned, and/or unbudgeted work. No matter the scope, all requests can be redirected toward a single entry point. This empowers data scientists and analysts to protect their own time where needed², instead of having to sort through their ongoing workstreams, saddle a colleague with a request, or wait to talk to their manager. Analytics leaders can see the full scope of incoming requests, and prioritize them with context. Retrospectively, we take stock of the quantity and breadth of these requests, which helps us plan for future support. In other words…

2) It helps quantify latent analytics needs

The more requests we fulfilled, the more we received. Sometimes requests spawned further requests in the form of follow-up analyses or operationalized reports. Organic growth from word of mouth and some self-promotional plugs in slide decks and Slack channels increased circulation. New submissions funneled into a forum to triage and track requests, and our backlog ballooned as we attempted to win a never-ending game of analytics whack-a-mole.

The need was always there, but before we created the process we had no way to quantify that demand for additional analytics. Over time, our estimates and projections of the effort required improved. Our understanding of the frequency and volume of requests from different corners of the business matured, and retrospectives of our process now help teams plan for the appropriate headcount to answer ad hoc needs without sacrificing support for new projects. In some cases, we can tie those numbers directly to the products or stakeholders with the most acute or recurring analytic needs (a sketch of that kind of retrospective is below), which means…
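
Here’s a minimal sketch of such a retrospective, assuming the request log can be exported to CSV with hypothetical columns submitted_at, requesting_team, and estimated_days (your ticketing tool and field names will differ):

```python
# Minimal sketch: quantify ad hoc demand by team and quarter from a request-log
# export. Column names are hypothetical; adapt to your own ticketing tool.
import pandas as pd

requests = pd.read_csv("aha_requests.csv", parse_dates=["submitted_at"])

demand = (
    requests
    .assign(quarter=requests["submitted_at"].dt.to_period("Q"))
    .groupby(["requesting_team", "quarter"])
    .agg(
        n_requests=("submitted_at", "size"),        # how many asks
        estimated_days=("estimated_days", "sum"),   # how much effort they represent
    )
    .reset_index()
    .sort_values(["quarter", "n_requests"], ascending=[True, False])
)
print(demand.head(10))
```

A table like this is usually enough to show which corners of the business drive the most demand and roughly how much headcount it takes to keep up.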

3) It helps us understand who our stakeholders are

Some business teams end up using the process more than others. Whether that corner of the organization has a deeper well of latent analytics needs, lacks its own analytics resources, or has a special interest in our data assets, it helps us to know who our biggest customers are. Keeping in close touch with these power users eases planning for support of the queue. In some cases, this comes in the form of a designated budget to support an increased volume of requests. It also gives data scientists space to embed with those teams and identify opportunities for new data products (read: dashboards) that meet recurring needs, aligning teams around common resources and metrics. More broadly…

4) It provides an opportunity for knowledge-sharing among business units

One of the compounding effects of a centralized intake process is connecting related requests from different corners of the business. Often a request can be answered directly by previous work products completed for someone else. Even if a new request isn’t exactly the same as a previous one, the first attempt at understanding stakeholder needs usually involves circulating related results in response to new submissions. This decreases our response time and avoids creating new metrics when existing ones will do. A quick turnaround with related material will often meet user needs, and if the desire remains for additional investigation or new metrics, our data practitioners can leverage and augment existing data products.

Spotting trends in requests across units and knowing who is invested in feature definitions means we can prioritize and operationalize features that provide the greatest value across the business, maximizing the team’s impact. Finally, it also helps our different data teams connect and share content. In other words…

5) It creates opportunities for exchanging knowledge and best practices within analytics teams

In the same way that we rally stakeholders together with common feature definitions and a shared space for submitting requests, the process forms a focal point for connecting data practitioners. When ramping up new hires, we rapidly expose them to many areas of the business and the germane data assets. An internal review process maintains quality and creates a contact surface for other practitioners and teams to track the new knowledge generated with each analysis. Finally, occasional pseudo-random assignment of new requests means that no individual need be pigeonholed into supporting any particular flavor of analysis. Mixing it up by design creates a forcing function for disseminating domain knowledge, techniques, tools, and code among individuals and teams.

What we learned to do (and not do) along the way

Our process has evolved substantially over the years to where it is today. The guidelines below highlight some of the dos and don’ts we’ve learned along the way, including what to ask, what to avoid, and what we don’t allow.

Photo by Clayton Robbins on Unsplash

Don’t ask for metrics

We used to ask requestors some variant of what output, metric, or data they needed analyzed. But since most business requestors don’t have an intimate understanding of all the assets in our analytics platform, the answers were often confusing (what logs are they referring to?) or impossible (we just don’t capture that type of data).

Do ask stakeholders to describe the underlying problem they’re trying to solve

We want to get a high-level understanding of where the requestor is coming from, including what they are trying to solve, what decision they are making, or what issue they are troubleshooting. This requires a little more work from our teams to interpret the request, but it has proven much more fruitful than forcing stakeholders to directly articulate what to measure or what data to use. Sometimes stakeholders know exactly what they want and what data to analyze, and that’s great: it makes for an easy win. Occasionally they think they know exactly what they want, or what data to analyze, but they may be wrong. Knowing the underlying why allows us to scrutinize a request and propose changes if needed. More often than not, our data scientists and analysts are the ones with the better understanding of our data (and its limitations), and part of our process involves proposing the right metrics, visualizations, or other data products to meet stakeholders’ needs.

Do ask them how they will use the output, or what decision will be made with the answer

This kind of prompt helps to further contextualize the requests, and get at the heart of "why" someone is asking for help. It not only makes for better metrics – getting at the root of a stakeholder’s need – but provides pivotal information for prioritization. We often infer the urgency and relative priority of a request from how the information is going to be used.

Don’t ask for a deadline

We used to ask, on new submissions, by what date a stakeholder needed results. This devilish calendar picker resulted in a lot of things due yesterday and mismanaged expectations. We’ve moved away from this (now optional) field and will probably deprecate it completely.

This doesn’t mean that deadlines don’t apply, but that requestors don’t set them by default. Since this process is for unbudgeted, unplanned work, it’s not intended to be a rapid-response on-call fire alarm. Even for non-urgent requests, stakeholders tend to pick what they think is "reasonable," but without any of the context to make that call (like other priorities in the queue, ongoing projects, complexities of the data, etc.) the timeline is arbitrary. Instead, we come back to requestors with our proposed delivery date after we triage new requests, which allows our teams to manage expectations without wrecking our sprints with constantly shifting priorities.

Do ask them what happens if they don’t get the answer (on time)

Similar to asking what is going to be done with the output, this helps to contextualize the need and drive prioritization. This is an even stronger tool for triaging and prioritizing requests. If someone can’t be bothered to justify what will happen without the analysis, they probably don’t need it right now.
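
Taken together, these prompts boil down to a handful of intake fields. Here’s a rough sketch of what such a form might capture (field names are illustrative; our actual form lives in a ticketing tool, not in code):

```python
# Illustrative only: the intake fields implied by the dos and don'ts above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AhaRequest:
    requestor: str
    requesting_team: str
    problem_statement: str              # the underlying problem or decision, not a metric
    intended_use: str                   # what will be done with the answer
    impact_if_unanswered: str           # what happens without it; drives prioritization
    desired_date: Optional[str] = None  # optional; we propose a delivery date after triage
```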

Don’t use it for fire drills

Since the queue is for unplanned work and balanced against all the other incoming requests (plus our ongoing, committed project priorities), we avoid using it for extremely urgent requests. A small core team reviews new and pending requests about once a week, and a typical request can take one to two weeks to complete once it’s been reviewed and resourced. Anything more urgent is left to teams with dedicated on-call staff or more appropriate processes.

Do broadly disseminate results when appropriate

My management philosophy includes radical transparency, and that extends to making results broadly available where reasonable. We catalog all our results internally, so we can re-share, recycle, or refresh them as needed. Sometimes a related analysis is more than enough to meet the stakeholder’s needs. In other scenarios, whoever picks up the request has an existing body of work to carry forward or a point of contact with domain expertise who has worked with similar data assets.

Don’t use it to plan or scope project-level work

While our AHA process helps to scope the amount of latent, miscellaneous analytics demand against our team, we don’t use it for every project or process. Requests are limited to unplanned work, typically not connected to any Projects-with-a-capital-P that already have assigned analytics resources. If a request needs cross-functional support from multiple teams (say, putting some model into production), that’s project-level work requiring more formal planning. It’s also scope-limited to something a single practitioner can tackle in no more than one to two weeks. Sometimes we’ll develop a prototype as part of an AHA request, and if the stakeholder is happy with the results, it de-risks a larger effort to put something into production.
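
If you wanted to encode those scope limits as a rule of thumb, it might look something like the sketch below. This is purely illustrative; in practice the routing decision is a human judgment call made during triage.

```python
# Purely illustrative rule of thumb for the scope limits described above.
def route_request(estimated_days: float, needs_cross_functional_support: bool) -> str:
    """Decide whether a request fits the AHA queue or needs formal project planning."""
    if needs_cross_functional_support:
        return "escalate: project-level work requiring formal planning"
    if estimated_days > 10:  # roughly the 1-2 week, single-practitioner limit
        return "escalate: too large for a single practitioner"
    return "accept into the AHA queue"
```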

Continual Improvement

Our process is young and continues to evolve, and there is always room for improvement. Some things we’re still working on that we didn’t discuss here include:

  • Closing the feedback loop. Collecting feedback on the process is important, but we want to automate feedback collection for each individual request. This would give stakeholders a chance to comment on the work products and create more opportunities for process improvement. It can also help us understand the real impact of work products that make it into "production" (such as a decision, publication, etc.).
  • Making results discoverable. Today the outputs of each request are cataloged and managed internally by our AHA team. Connecting stakeholders to past outputs or searching through past analyses is a mostly manual process, and we want to move closer to a self-serve, searchable model for both practitioners and requestors. We’re in the process of migrating all our work products to a more user-friendly, searchable platform to increase discoverability and provide a better-defined surface for those one-off work products "in production."
  • Integrating more teams. We are looking at models for other practitioners to serve in "rotations" on the AHA team. With broader participation, we could support more requests and bring the same benefits described above to more teams in their unique domains. At the same time, we can strengthen our internal analytics community by developing a shared understanding of analytics needs across the business.

I continue to learn more every year as we refine our process, and hope these learnings help you in building your analytics functions. Let me know what’s worked for you.


[1] Over time, the first instinct for folks with a business question becomes "let’s ask the data team." This is a good problem, both flattering and exhausting.

[2] "Great question! Here’s a link to our request form…" is where some requests end. If it’s not worth 5 minutes to complete a form articulating the need, it likely isn’t worth a data scientist spending hours to get the answer.


The views expressed within are my own personal opinions and do not represent the opinions of any organizations, their affiliates, or employees.

