
No, it’s not a spelling mistake… Memoization is a thing.
When I first learned about this technique, I was struck by a sense of extreme beauty – a feeling I knew only from mathematics, where you sometimes get a glimpse of higher structural beauty.
Memoization did that to me because I quickly realized that it is a specific case of a more general phenomenon in computing. The big secret is that memory and execution speed are two sides of the same coin.
Not only was I able to run my code much faster, but it opened my eyes to a more nuanced view of programming as a whole.
In this article, we will implement memoization in Python with a simple example.
Speed vs Memory
Grab a cup of coffee and your laptop and – oh – you might wanna sit down for this one!
In programming, there are three kinds of speed.
First, there is the raw power of the programming language, which has nothing to do with the design of the program but is a built-in feature of the language itself. Go, for instance, is faster than Java and much faster than Python, while C++ and Rust are faster than Go.
Then we have a second kind of speed: the effectiveness of the algorithm, also called its complexity. Go and Rust are fast, but if you choose the wrong algorithm, it really doesn’t matter how fast your programming language is – the program will be slow.
We’ll actually see an example of this in a minute.
Thirdly, we have the speed of development. This is often more important than the small speed gains of a language: Go and its quick compiler beat Rust on this metric even though Rust is faster in the raw sense, while Python beats almost all other languages because of its gazillion libraries and the huge community surrounding it.
Sometimes the raw performance of the language is the most important thing, though, and then we shouldn’t (necessarily) choose Python. But even pure Python can be much quicker than Rust if you use the right algorithm in Python and the wrong algorithm in Rust.
In general, memory and speed are related a little like energy and mass are in physics. They are two sides of the same coin in the sense that you can convert one of them into the other.
That is, you can increase speed (or equivalently, reduce time) by increasing memory usage "the right way". This is called memoization.
Memoization is a way to lower a function’s time cost in exchange for space cost. That is, memoized functions become optimized for speed in exchange for a higher use of computer memory space.
The time/space "cost" of algorithms has a specific name in computing: computational complexity. All functions have a computational complexity in time and in space.
– Wikipedia
So let’s see how we can use the idea of trading memory for speed in Python to optimize the code.
Memoization in Python
Say we want to calculate Fibonacci numbers. Who wouldn’t? Moreover, we want to implement an elegant recursive algorithm in Python.
Let’s try that.
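The article’s original snippet isn’t reproduced here, so below is a sketch of what the naive recursive version could look like, computing (say) the 15th Fibonacci number:

```python
def fib(n):
    # the first two Fibonacci numbers are both 1
    if n <= 2:
        return 1
    # every other number is the sum of the two preceding ones
    return fib(n - 1) + fib(n - 2)

print(fib(15))  # 610
```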
We get 610 in an instant. What do you know? It works!
However, when we are back in our chair with a new cup of coffee and then try to calculate the 100th Fibonacci number… Nothing happens!!!
Not a single output for a LOOOOONG time. What is wrong here? We just saw a lightning-fast result before.
Well, this algorithm actually has exponential complexity, meaning the runtime grows exponentially as a function of the input.
That is not good.
The reason is that each call to fib has to recurse all the way down to 1, and we call fib many, many times. This means that we calculate fib(2) an insane number of times, for example.
This can of course be optimized.
Consider the following code, where we use a dictionary to hold cached output values.
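A sketch of that idea (the dictionary name memory follows the text below; the rest is my reconstruction): before computing anything, we check whether the answer is already in the dictionary.

```python
memory = {}  # global cache: maps n to the n-th Fibonacci number

def fib(n):
    if n in memory:      # already computed? just look it up
        return memory[n]
    if n <= 2:
        return 1
    memory[n] = fib(n - 1) + fib(n - 2)  # compute once, store for reuse
    return memory[n]

print(fib(100))  # 354224848179261915075
```

Each value is now computed exactly once, so the runtime is linear in n instead of exponential.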
We get the output 354224848179261915075. It works.
However, we don’t want variables hanging out in global scope, right? It would be much nicer if the memory dictionary were cleared when the function finished executing.
Well, we need to have the caching dictionary in some local scope that dies with the function. This sounds like a decorator problem.
Consider this.
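The original snippet isn’t shown here, so this is one possible sketch of such a decorator. The names memoize and wrapper are my own, and the depth counter is one way to ensure the cache is emptied once the outermost call returns:

```python
from functools import wraps

def memoize(func):
    cache = {}  # lives in this closure, not in the global namespace
    depth = 0   # tracks how deep into the recursion we are

    @wraps(func)
    def wrapper(n):
        nonlocal depth
        if n in cache:
            return cache[n]
        depth += 1
        try:
            result = func(n)
            cache[n] = result
        finally:
            depth -= 1
            if depth == 0:
                cache.clear()  # outermost call is done: free the memory
        return result

    return wrapper

@memoize
def fib(n):
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075
```

In practice, Python’s standard library already ships functools.lru_cache, a battle-tested decorator that does the caching for you (although it keeps its cache between calls).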
This prints out the above number just as fast, but this time all the cached data is cleared after the function has returned. We have wrapped the caching around the function, so to speak, and that makes Python run this calculation millions of times faster!
We could of course build a fast Fibonacci function without memoization, but that would require a loop instead of recursion, and recursion is really nice from a logical point of view.
Recursion can be much simpler, more readable and, in fact, beautiful… And who doesn’t want a combination of simple, beautiful and fast? That being said, we should be a little careful with recursion in Python, because we can easily hit a stack overflow, meaning that the call stack becomes too large.
Python has a relatively low recursion limit, even though you can modify it.
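In CPython the default limit is 1000 stack frames. If you are sure you need more, the sys module lets you inspect and raise it (a sketch; raising the limit is at your own risk, since a genuinely runaway recursion can then crash the interpreter):

```python
import sys

print(sys.getrecursionlimit())  # 1000 by default in CPython
sys.setrecursionlimit(5000)     # raise the ceiling
print(sys.getrecursionlimit())  # 5000
```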
In general, recursion is more of a functional approach, since in (purely) functional languages you don’t have loops: loops require mutating variables, and functional languages don’t allow that.
Python is an object-oriented language, and if you need to go deep, use loops! Especially if you don’t know how deep you are going. Maybe the depth of the recursion will be user-defined (if you are building a recursive file crawler, for example).
It would be annoying to get a stack overflow error you didn’t predict because you tested on a shallower file system/folder structure than your users have.
By the way, how would you create this fib function using a loop?
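One possible answer (a sketch, with a function name of my own choosing): keep only the last two numbers of the sequence, so you need neither a cache nor a deep call stack.

```python
def fib_loop(n):
    # a and b walk along the sequence: a = fib(i), b = fib(i + 1)
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_loop(100))  # 354224848179261915075
```

This runs in linear time with constant memory – arguably the best of both worlds, just less elegant than the recursive formulation.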
Okay… I have tortured you enough, I guess. The word memoization is derived from the Latin word memorandum, meaning “to be remembered”.
Luckily, you don’t need to remember this in order to use it!