Notes from Industry

How I helped a VC automate its portfolio performance monitoring

By setting up a lightweight ELT project using Google stacks

Wenling Yao
Towards Data Science
7 min readMay 7, 2021

--

Photo by Markus Spiske on Unsplash

Aside from my work, one interesting side project I’ve worked on is to help one investor of my current employer automate the performance monitoring of its portfolio companies.

Data, automation, performance monitoring —these should sound like an easy job in a world where “data is the new oil” is the new mantra and we have vast choices of data integration platform vendors on the market. However, these solutions may be too much for such a special use case. We don’t really have big data here, and data latency is not on top of our heads.

The real challenge we were facing is to reduce the manual labor in the process of compiling, cleaning, and analyzing 200+ Excel sheets we receive from 30+ portfolio companies on a regular basis every year while at the same time ensuring consistency and accuracy. Our end users are also not very data-savvy (yet), so the tools we choose should not have a steep learning curve. We also don’t have any engineering capacity to build and manage complex data pipelines. Therefore, what we are looking for here is a really lightweight ELT solution (Extract, Load, and Transform) with dashboarding functions. Fortunately, our investor already uses G-suite as their office stacks and Google provides a wide range of analytics services to choose from. With some creativity, together with my investor, we designed a workflow that looks like the following:

  1. We invite portfolio companies to share their monthly key performance data by filling out a well-designed, standardized Google Form — this is the “E” part.
  2. Connect the Google Form to a Google sheet, so that the answers will be compiled automatically to a performance dataset — this is the “L” part.
  3. Create some auxiliary datasets to collect additional static facts that we can link with the performance dataset so that we have more interesting dimensions (e.g., company HQ, funding status, investment year, etc) to slice & dice in our further analyses.
  4. Add datasets to Data Studio as data sources, join these datasets, create calculated fields and metrics, and eventually build automatic, interactive dashboards to drive our decision making — this is the “T” part.

The result? The investor adopts this solution for their portfolio performance monitoring since then. Guess what? In their recent job posting for associates, they put up a requirement: “Nice-to-have: you have some experience with modern Business Intelligence solutions (e.g., Google Data Studio, Tableau, Power BI)”. 😉

This is definitely not a big data story. It does not involve any modern data stacks such as Snowflake and Redshift. It is also not about developing an advanced machine learning model to solve a very complex real-life problem. Nevertheless, this project means a lot to me because it reveals that there are many possible ways to create value from data other than big, advanced technical projects. A small step in my eyes can actually create real, tremendous business value to others who are yet not as data-savvy as me and inspire them to build a strong data culture in their organizations.

If you are interested, you can find below a template pipeline that I have set up using some fake numbers. If you have a Google account, you can copy the folder and the Google Data Studio Report to your own Google drive and start playing around with it. I also share some useful tips for each step.

Here are the links to the templates. Have fun!

Design a Google form to collect your data.

There are three things I like about Google Form:

  • One does not necessarily need a Google account to fill out a Google Form.
  • You can set up data validation rules for each question so that you don’t have to spend time on manual sanity checks for typos, digits, etc.
  • You can connect a Google Form to a Google Sheet and the answers will flow automatically to that sheet.

Bear in mind that each question in your Google Form will enter the Google Sheet as a column. Therefore, make sure you design your questions in a smart manner!

It can be a bit tedious if you want to add/change questions in the Google Form, which can result in duplicate or mismatched columns in the destination Google Sheet. Therefore, I would recommend testing and improving your Google Form as much as possible before sharing it with your portfolio companies!

The last reminder: try not to do any modifications in the destination Google Sheet, e.g., create a month column based on the date or link to another sheet to get a new column. These transformations can all be done centrally in Google Data Studio, in a much easier and cleaner manner. The advantage of keeping all transformation in one central place is obvious: we don’t have to go to multiple places to check and change the dimension and metric definitions.

This is a list of questions in our monthly performance monitoring information sheet:

Monthly Portfolio Performance Monitoring (Part 1)
Monthly Portfolio Performance Monitoring (Part 2)

This is how the destination Google Sheet looks like:

Image by Author

I chose these questions in my template because we can derive most KPIs that can be used to assess the performance of startups in all industries. Some examples are:

  • Customer acquisition: #New Customers (and YoY Growth)
  • Retention = 100%-#Total Churned Customers/#Total New Customers
  • Financials: Variable Costs (Revenue-Gross Margin), Fixed Costs (Gross Margin-EBITDA), Profit Margin (Gross Margin/Revenue)
  • Unit economics: CLV/CAC
  • Operational Efficiency: #Total Customers per FTE, Costs per FTE

Create some auxiliary datasets for more dimensions in your final report.

These auxiliary datasets should be static facts — things that don’t change or grow that often.

In my example there is one company fact sheet that looks like the following:

Image by Author

With a common column “Company Name” we can join this fact sheet with the monthly performance datasheet, and use additional dimensions such as Short Name, Country, Fund Cohort, and Funding Status as filtering and grouping columns in Data Studio Report. Also, remember to select all dimensions and metrics you would like to use from both tables so that they show up in the final blended table.

Image by Author

For example, we can add these filters to each slice of the dashboard and apply them to all charts on one dashboard. We can also have an overview of our portfolio companies per country and region.

Image by Author

Create automated, interactive dashboards on Data Studio.

There are many useful tutorials on how to create reports on Data Studio from Google Sheets, and I would always recommend this online course from Google.

Here I would only share one tip that I found very helpful with my project: Remember to set up a “Data range” column in your data source and use it in the charts for date range filtering. Usually, it should be the reporting month's start date). This is very critical in our case because in the destination Google Sheet we have for one KPI both the “Last Reporting Month” column and “YTM cumulative” column, and we always have to create metrics based on these columns (e.g., SUM(Revenue (LRM), SUM(Revenue (YTM)) to build charts. We have to make sure that we filter for last month so that we don’t show a sum of all monthly revenue when we actually just want to show the revenue for the last reporting month.

Image by Author
Image by Author

When it comes to how to build intuitive and easily digestible dashboards, I applied many principles that I shared in my previous post How I create an Analyst Style Guide. Feel free to have a check!

I hope this post can give you some inspiration if you are facing a similar challenge and looking for a lightweight, no-code BI solution.

Any thoughts and questions are welcome!

--

--

INSEAD MBA 22J | Business Intelligence | Fintech | Realist | Explorer | Make well-informed decisions every day.