Calculating Internal Rate of Return (IRR) in BigQuery

A step-by-step guide to implementing Excel’s IRR function in BigQuery

Bilal Khan
6 min readJun 16, 2021

Internal Rate of Return (IRR) is a common calculation that often comes up in finance. In this blogpost, I will show how to build a query in BigQuery that carries out the calculation equivalent to Excel’s IRR function. This is helpful if the cashflows are stored in a BigQuery table and you want to calculate IRR without moving the data to Excel, or if you want to use BigQuery as a source for data visualizations and BI.

Photo by Scott Graham on Unsplash

1. Background: Internal Rate of Return (IRR) and Net Present Value (NPV)

Internal Rate of Return (IRR) is the rate of return at which the Net Present Value of all cashflows resulting from an investment equal zero. It primarily serves as a decision rule: the higher the IRR of a series of cashflows from an investment and the greater the amount by which it exceeds the cost of capital, the more profitable the investment.

Mathematically, IRR is the value of r that solves the equation:

This is the same as:

Here, t is the time period (from 0 to N), CF is the cashflow at period t , and r is the rate of return (IRR).

The above equation does not have an analytical solution. In other words, there is no mathematical formula that can give us the Internal Rate of Return. The only way to solve it is numerically, by trying a number of values of r and seeing which one comes closest to satisfying the equation.

For any given value of r , the left-hand side of the above equation gives us the Net Present Value:

2. Example IRR calculation using Excel’s IRR function

We will work with the following simple example for the rest of this guide. Let’s assume that we have the following cashflows stored in the file irr.csv

PERIOD, CASHFLOW
0, -10000
1, 2000
2, 6000
3, 6000

We can use Excel’s IRRfunction to calculate the Internal Rate of Return. Excel’s documentation describes the IRR function as follows:

Returns the internal rate of return for a series of cash flows represented by the numbers in values. These cash flows do not have to be even, as they would be for an annuity. However, the cash flows must occur at regular intervals, such as monthly or annually.

Calculating IRR in Excel is fairly straightforward:

Our goal is to implement this in BigQuery.

3. Net present value

a. As a first step, let’s calculate the Net Present Value of the above cashflows in BigQuery. We can easily importirr.csv data to a BigQuery table using the BigQuery console. In this guide, the resulting table is namedtutorial.irr.cashflows

b. Once the table is created, we can run the following query which gives returns an additional column containing the present value of each cashflow:

SELECT
period,
cashflow,
cashflow/POWER(1+0.16, period) AS present_value
FROM
`tutorial.irr.cashflows`
ORDER BY
period

BigQuery outputs the following query results:

c. Net present value is simply the sum of the present value of cashflows. Therefore, we can use the SUM aggregate function to get the Net Present Value of cashflows:

SELECT
SUM(cashflow/POWER(1+0.1614, period)) AS net_present_value
FROM
`tutorial.irr.cashflows`

BigQuery outputs the following query results:

The Net Present Value is close to zero which verifies the result we got from Excel that 16.14% is the approximate value of the IRR.

4. Internal Rate of Return

a. Now that we know how to calculate the Net Present Value, our goal is to find the Net Present Value for a range of values of r and select the one closest to zero. To generate a range of values for r, we will make use of BigQuery’s GENERATE_ARRAY function. Here is a brief explanation of the GENERATE_ARRAY function from BigQuery’s official documentation:

GENERATE_ARRAY generates an array of values from a starting and ending value and a step value. For example, the following query generates an array that contains all of the odd integers from 11 to 33, inclusive:

SELECT GENERATE_ARRAY(11, 33, 2) AS odds;

+--------------------------------------------------+
| odds |
+--------------------------------------------------+
| [11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33] |
+--------------------------------------------------+

b. To illustrate how we will find the IRR, let’s consider a small set of values of r [0.14, 0.15, 0.16, 0.17, 0.18, 0.19] as guesses for IRR. (We will consider a larger range of values later. For now, assume we know that IRR lies in the above range. We know from our previous calculations above that IRR is closest to 0.16.) The following query generates the Net Present Value for each of these guess values:

SELECT
guess,
SUM(cashflow/POWER(1+guess, period)) AS net_present_value
FROM
`tutorial.irr.cashflows`
CROSS JOIN
UNNEST(GENERATE_ARRAY(0.14,.19,0.01)) AS guess
GROUP BY
guess

We can see that the Net Present Value in the results table is closest to zero when the guess is equal to 0.16.

c. Our next task is to write a SQL that gives the value of r for which the Net Present Value is closest to zero (i.e., IRR). The following query accomplishes this by using a common table expression:

WITH
npv_table AS (
SELECT
guess,
SUM(cashflow/POWER(1+guess, period)) AS net_present_value
FROM
`tutorial.irr.cashflows`
CROSS JOIN
UNNEST(GENERATE_ARRAY(0.14,.19,.01)) AS guess
GROUP BY
guess)
SELECT
guess AS IRR
FROM
npv_table
WHERE
ABS(net_present_value) = (
SELECT
MIN(ABS(net_present_value))
FROM
npv_table)

BigQuery outputs the following query results:

d. Now that we know how to calculate the IRR, we can consider a larger range of values of r in the GENERATE_ARRAY function with the following query:

WITH
npv_table AS (
SELECT
guess,
SUM(cashflow/POWER(1+guess, period)) AS net_present_value
FROM
`tutorial.irr.cashflows`
CROSS JOIN
UNNEST(GENERATE_ARRAY(0,2,.0001)) AS guess
GROUP BY
guess)
SELECT
ROUND(guess*100,2) AS IRR_percentage
FROM
npv_table
WHERE
ABS(net_present_value) = (
SELECT
MIN(ABS(net_present_value))
FROM
npv_table)

BigQuery outputs the following query results:

This is the exact same value that we got from Excel’s IRR function. (The IRR value has been formatted as a percentage to two decimal places in the final query.)

Thanks for reading! Please share feedback in the comments.

--

--