Recursion is one of those concepts that I find myself relearning every time I don’t use it for a while. I remember generally what it is and why we need it, but after a long lay off, programming recursively is always a struggle.
So now that it’s once again fresh on my mind, let me document why and how we use recursion. If you would like to download the code used in this post, you can find it on my GitHub here.
Everyone who has studied recursion probably remembers the classic example of calculating Fibonacci numbers:
# Function that returns the nth number in the Fibonacci sequence
# For example, if n=3 the 3rd number in the sequence is returned
def fib(n):
if n==0:
return 1
elif n==1:
return 1
else:
return fib(n-1) + fib(n-2)
Let’s use it to understand how recursion works before we move on to a more practical example.
Recursion is when a function calls itself. I liken recursion to a do-while loop, which sadly doesn’t exist in Python. Unlike a for loop where you specify in advance the number of times you want something to run, a do-while loop runs until a terminating condition is met. The beauty of the do-while loop is that it runs until the job is done – you don’t need to know in advance how many times you need it to run.
Recursion is great because it’s a worthy replacement for the do-while loop. It also runs until the job is done. But instead of a loop, recursion is like the movie Inception. In Inception, Leonardo DiCaprio and his jolly band of dream thieves venture forth into ever more complicated layers of a person’s dream – where each layer is a dream within a dream. There’s no limit to how deep they can go, provided that they can find a way back.
Each time our recursive function calls itself, it’s effectively going one layer deeper (another dream within the current dream). But just like the characters in Inception, going deeper is a means to an end. Let’s check out our Fibonacci code, specifically the last line:
return fib(n-1) + fib(n-2)
Here the function is calling itself twice and adding the results together, but whereas the initial function is called with n as the input, these two calls are made with modified inputs: n-1 and n-2. So basically each dream within a dream whittles away at the initial input n (by subtracting from it) until it reaches its end. Let’s expand fib(5) to see how this works:
fib(5)
= fib(4) + fib(3)
= fib(3) + fib(2) + fib(2) + fib(1)
= fib(2) + fib(1) + fib(1) + fib(0) + fib(1) + fib(0) + 1
= fib(1) + fib(0) + 1 + 1 + 1 + 1 + 1 + 1
= 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
= 8
Notice that the function keeps Incepting until it reaches fib(1) or fib(0), which are terminating cases. For example, fib(3) becomes fib(2) + fib(1), which becomes fib(1) + fib(0) + fib(1). This propagation of itself until a terminating case is reached is how the function goes deeper and deeper (into dreams within dreams) until it reaches an exit point.
We can verify that 8 is the correct answer by checking it against our function:
In: fib(5)
Out: 8
In:
# For fun, print first 10 values of the Fibonacci Sequence
for i in range(10):
print(fib(i))
Out:
1
1
2
3
5
8
13
21
34
55
Termination
Let’s spend some time and talk about the terminating cases. These are critical as without these cases, the function would just keep calling itself indefinitely and get stuck in an endless loop. So the first step of writing any recursive function is specifying the conditions at which it can exit (terminate).
In the case of Fibonacci, we know that that the first two values in the sequence are 1 and 1. That is, fib(0) = 1 and fib(1) = 1. So anytime we get an input of 0 or 1, then we can just return a 1 and call it a day. And if our input is greater than 1, then we need to work towards our terminating cases (of 0 or 1) by Incepting towards them via a sequence of recursive function calls like we did above.
A More Practical Use Case
Printing the Fibonacci numbers is a neat party trick, but now let’s go over a more practical use case – traversing trees. The problem with trees is that we don’t know how deep they go. You could have one like this:
root
|
|
b1 b2
Or a slightly deeper one like this:
root ____________
|
|
b1 ____ b2 b3
| |
l1 l2 l3 l1 l2
Let’s write a function that is generalized enough that it can handle both of these trees (or even more complicated ones). First let’s build the two trees we depicted above using Python dictionaries:
my_dict0 = {'root': {'b1': [1],
'b2': [2]}
}
my_dict1 = {'root': {'b1': {'leaf1': [1,2,3],
'leaf2': [4],
'leaf3': [5,6]
},
'b2': {'leaf1': [7,8],
'leaf2': [9,10,11]
},
'b3': 12
}
}
Now let’s think about how we might traverse these trees via recursion. First let’s define the terminating case. Because of the way we structured our dictionaries, the terminating case is pretty simple: if we see a list (or a single value) we can just append the contents and move on. Notice that I store the results in an expanding list called result (result is what I return when the function is done running). Also, note that though I call it a terminating case, we aren’t returning anything yet. Rather, we are just appending the end node’s value(s) to result and continuing on in our for loop as we know that we are done with this current subsection of our tree.
def parse_dict(my_dict, result):
for key, val in my_dict.items():
if type(val) == list:
for i in val:
result.append(i)
What if we don’t see a list? Then we need to dive deeper via recursion (Inception). The way we do that is by calling our function again but on a modified input (if we didn’t change our input each time we call, we would get stuck in an infinite loop). So instead of giving our function the original input, we now give it just the section of the tree (my_dict) that we are currently examining. For example, one recursive call to parse_dict might process just the b1 branch (which has 3 leaves: leaf1, leaf2, and leaf3) while another one would process just the b3 branch (which has just the value 12).
def parse_dict(my_dict, result):
for key, val in my_dict.items():
if type(val) == list:
for i in val:
result.append(i)
elif type(val) == dict:
parse_dict(val, result)
The last thing to add is a terminating case that checks for when the branch has no leaves or lists and contains just a single numerical value (such as b3). In this case we can directly append the value in val to result:
def parse_dict(my_dict, result):
for key, val in my_dict.items():
if type(val) == list:
for i in val:
result.append(i)
elif type(val) == dict:
parse_dict(val, result)
else:
result.append(val)
return result
If we run the function on the simple tree, my_dict0, we get:
In: parse_dict(my_dict0, [])
Out: [1, 2]
And if we run it on my_dict1, we get:
In: parse_dict(my_dict1, [])
Out: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Cool, we’ve traversed our trees and unearthed the values stored in each end node. And our function works regardless of whether the tree is a simple one or a more complicated, deeper one.
One thing to keep in mind is that if you have an extremely complicated tree with many layers and many branches, then traversing it recursively may take a really long time. If you are working with really complicated tree structures, then it makes sense to invest some time and thought towards building a more optimized solution.
Bonus: Getting The Path To Each Value
We might also want to know the path traversed to reach each end value (or the path to a specific end value). The following function produces a list containing the path to each end value (that shares the same ordering as parse_dict’s output). It works in a similar fashion to parse_dict so I won’t go into the details (but there are some comments in the code).
def get_path(my_dict, result, path):
for key, val in my_dict.items():
# If it's a dict, go one level deeper and update path
# to include the current key (node name)
if type(val) == dict:
get_path(val, result, path+[key])
# If it's not a dict, then append the key and don't
# go down another level (since no need to)
elif type(val) == list:
for i in val:
result.append(path+[key])
else:
result.append(path+[key])
return result
Here is some example output. Notice that we need to provide two inputs – result, which is returned, and path, which is an intermediate output that gets appended to result (but not returned at the end of the function).
In: get_path(my_dict1, [], [])
Out:
[['root', 'b1', 'leaf1'],
['root', 'b1', 'leaf1'],
['root', 'b1', 'leaf1'],
['root', 'b1', 'leaf2'],
['root', 'b1', 'leaf3'],
['root', 'b1', 'leaf3'],
['root', 'b2', 'leaf1'],
['root', 'b2', 'leaf1'],
['root', 'b2', 'leaf2'],
['root', 'b2', 'leaf2'],
['root', 'b2', 'leaf2'],
['root', 'b3']]
Hope this was helpful. Cheers!
More Data Science Related Posts By Me:
Understanding Cross Validation
What In The World Are QQ Plots?