
Introduction
Simple is sometimes better. Keeping things simple helps you focus more on things that matter the most, the value. You can always use fancy stuff (a better algorithm, machine learning, etc), but when you use them, it’s quite tempting to also try a bunch of irrelevant stuff that won’t increase the value of your work by that much. Of course, you will also need to maintain the quality of your solution.
In this article, I want to show you cool stuff that you can achieve using a simple solution by utilizing Bigquery. You only have to get yourself familiar with some usable BigQuery analytic functions. If you haven’t check out my other post about it.
We will be using public datasets, but the problem statement should be relatively the same across similar businesses. With that in mind, I hope you can have just another inspiration to tackle similar problems your organization is facing. Let’s jump right into it!
Problem statement
I just made this up for the sake of the example, as simple as possible.
You are working for a company that is selling something to stores (b2b). Your company has some people working as an offline sales team that will reach out to these stores as potential leads.
You already have some lead data available in your database, but you have to somehow prioritize them in such a way that it will optimize the cost and time needed for the sales team to reach them.
I will be using the new_york_citibike.citibike_stations public dataset that is available in BigQuery. For this example, I will change the station id to the store id.

Simple Solution
We can optimize this sales operation by using a simple approach, let’s break them down.
Prioritize by region
As you already notice in the dataset, the stores are located in several regions, therefore it’s reasonable that we optimize the routing by regions. You can also see the field capacity
in our database, let’s pretend it’s a number that shows how many potential products we can sell to them.
Our logic in this step is quite simple: select stores with most capacity in each region
.

Now we have top leads for each region that our offline sales team should visit first. On to the next steps.
Select next leads to visit
Next, we have to decide which leads our offline sales team should visit next. It’s quite straightforward, just select the next lead that is closest in distance from the previous lead.
explanations:
- We create a base CTE consisting of only the data that we need. We also give it a rank based on the capacity as we did in our previous step. Also notice that we convert the coordinates into Geogpoint objects (read this if you want to know more about it).
- We create a top_leads CTE to collect only the top leads based on capacity.
- Finally, we calculate the distance between the top leads and each of the leads by using inner join by region. One final touch is to sort them by the calculated distance (in meters).

Leads assignment
One last step is to assign some leads to each salesperson. The number of assignments may depend on the number of the offline sales team, but let’s say each of them can handle ~5 leads per day (1 offline visit).
So our step is just to break down our leads to match the number. Select top 5 leads, assign them to the salesperson and continue until all leads are assigned.
But there is one important step to make sure the quality of the work that we deliver is decent enough. Let’s say these 5 leads were assigned to salesperson A:

So far our solution only calculates distance without considering the actual road. Therefore, each time a salesperson wants to visit the leads, they need to adjust their route accordingly in Google Maps (in reality, they will probably use Google Maps or another type of navigation anyway, so this won’t increase their work that much).
First, just input all destinations:

Second, adjust the route based on the road:

Note: this manual adjustment depends on country, location, road type, etc. It is also up to the salesperson on how they want to decide the visit, but at least we help them to identify which 5 nearest locations they need to visit in a day.
Link to the example above: HERE
Conclusion
We’ve utilized some useful BigQuery functions (analytic and geography) to help us deliver a simple solution to the business problem. Now it’s up to you to communicate the solution to your stakeholders or business person. The Simplicity of your solution might help you a lot in explaining how you get this done!
In my day-to-day activity as a data practitioner, I’ve worked on many of these kinds of simple yet valuable solutions. I hope by sharing this with you the readers might inspire you also to focus more on what matters the most.
As always, happy learning!! 🚀🚀