
Temporary Variables in Python: Readability versus Performance

Temporary variables can make code clearer. What about the performance of such code?

PYTHON PROGRAMMING

Are Python shortcuts fast? Photo by Stefan Steinbauer on Unsplash

Temporary variables are variables with short lifetimes:

Temporary variable – Wikipedia

They are used very often in programming, and you don’t have to know this term to use temporary variables. One of the most common use cases is to make code clearer, for instance, in pipelines:

input → tempvar_1 := func_1(input) →
        tempvar_2 := func_2(tempvar_1) →
        func_3(tempvar_2) → output

Here, I used Python’s walrus operator to visually represent an assignment, just the way it’s used in Python code. In this pipeline, we have two temporary variables: tempvar_1 and tempvar_2. Their lifetime is short in terms of data flow through the code, although it can be long in terms of real time. tempvar_1 is used for only one purpose: to pass the results from the first step of the pipeline to the next. But note that technically, it’s unnecessary:

input → func_3(func_2(func_1(input))) → output

Both versions work the same way, although the latter can be much less readable. That is why the former version is so common in programming: its only purpose is to make the code cleaner.

Note that had tempvar_1 or tempvar_2 been used later on in the code, it wouldn’t have been a temporary variable, since it wouldn’t have had a short lifetime. For the sake of simplicity, we can assume that a temporary variable is one you use just once, in order to pass the output of one callable as input to another.

Have you ever pondered if using temporary variables in a pipeline constitutes a better option than a direct – and shortest – way of calculating the pipeline? As in, which of the two following snippets is better?

# snippet 1
third_function(second_function(first_function(x)))

# snippet 2
x1 = first_function(x)
x2 = second_function(x1)
x3 = third_function(x2)

Or, this time using simple arithmetic:

# snippet 1
x1 = 2.056 * x
x2 = x1 / (1 + x1)
y = 2.3 / (- x2 - 7.33)

# snippet 2
y = 2.3 / (- 2.056 * x / (1 + 2.056 * x) - 7.33)

Which would you choose? Does it even matter?
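
As a minimal sketch, the two arithmetic snippets can be wrapped in functions and compared directly. The function names with_temporaries() and one_shot() are hypothetical, introduced only for this illustration; math.isclose() is used because floating-point results are safest compared with a tolerance.

```python
import math


def with_temporaries(x):
    # Hypothetical wrapper around the version with temporary variables.
    x1 = 2.056 * x
    x2 = x1 / (1 + x1)
    return 2.3 / (-x2 - 7.33)


def one_shot(x):
    # Hypothetical wrapper around the single-expression version.
    return 2.3 / (-2.056 * x / (1 + 2.056 * x) - 7.33)


# Both versions describe the same calculation, so the results should agree.
values = (0.0, 1.0, 2.5, 100.0)
results_agree = all(
    math.isclose(with_temporaries(v), one_shot(v)) for v in values
)
print(results_agree)
```

So the choice between the two is about readability and speed, not about the result.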


Python is popular for various reasons, one of them being the readability of its code. At the same time, Python is known for its poor performance – although it isn’t as bad as many claim, as I wrote in the following article:

The Speed of Python: It Ain’t That Bad!

Oftentimes, you can – and need to – choose between readability and performance. Sometimes you need even the slightest improvement in performance, even if it means decreased readability. Other times, a small improvement in performance comes with no side effects and leaves the code as readable and comprehensible as its slower counterpart; why wouldn’t you go for it?

When an improvement in performance comes at some cost, however, you should be careful. You should ask yourself – or the development team you’re part of – the following question: Is this minor improvement in performance worth decreasing code readability?

In this article, I want to show you an example of such an improvement, achieved by avoiding temporary variables. Getting rid of them can improve performance a little bit, but usually at a cost of decreased readability. Yes, usually, so not always: if you’re lucky, getting rid of temporary variables can help you improve both performance and readability. A perfect situation, isn’t it?

Temporary variables in Python code

Imagine you want to implement a function that calculates a sequence of things. To keep the example simple, we will perform some basic arithmetic calculations. In real life, however, such pipelines can contain several functions doing various things, some of them quite complicated.

def calc_with_tempvar(x):
    y = x**2
    z = y/2
    f = z + 78
    g = f/333.333
    return g

So, we’re starting with x, and then we calculate y, z, f and finally g, g being the final output, so it’s returned. This is similar to function composition, the difference being that here we don’t compose functions but calculations. In many scenarios, however, you will have actual functions; for instance, instead of y = x**2, you can have y = some_function(x). A perfect example of when something like that works in Python is generator pipelines:

Building Generator Pipelines in Python

and their general version, comprehension pipelines:

Building Comprehension Pipelines in Python
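
As a rough sketch of the idea, here is what such a generator pipeline could look like for the same calculation applied to a sequence of numbers. The function names pipeline_with_generators() and pipeline_one_shot(), as well as the variable names, are assumptions made for this illustration; they do not come from the linked articles.

```python
def pipeline_with_generators(numbers):
    # Each intermediate name holds a lazy generator, not a computed list,
    # so nothing is evaluated until the final list comprehension runs.
    squares = (n**2 for n in numbers)
    halves = (s / 2 for s in squares)
    shifted = (h + 78 for h in halves)
    return [v / 333.333 for v in shifted]


def pipeline_one_shot(numbers):
    # The same calculation written as a single expression per element.
    return [((n**2) / 2 + 78) / 333.333 for n in numbers]


data = [1, 2.3, 10]
print(pipeline_with_generators(data) == pipeline_one_shot(data))
```

Here the temporary names label each stage of the pipeline, which is the readability argument in its purest form.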

In simple situations, like the one in our calc_with_tempvar() function, using temporary variables seems like overkill. Instead, we could simply do the following:

def calc_without_tempvar(x):
    return ((x**2)/2 + 78)/333.333

Both lead to the very same results, as these tests show:

>>> for x in (1, 2.3, 0.0000465, 100_000_000.004):
...     assert calc_with_tempvar(x) == calc_without_tempvar(x)

No output means this is true indeed.

First, let’s disassemble the two functions, to see their translations to Python bytecode:

Disassembly of the two functions using the dis.dis() function. Image by author

Even without analyzing the bytecode of the two functions, we see that Python has a more complex job to do with the function employing the temporary variables than with the one without them. This does not come as a surprise, does it? The very way the functions are defined suggests that calc_with_tempvar() will have to do more to reach the outcome than calc_without_tempvar().
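
Since the disassembly is shown only as an image, here is a small sketch of how it could be reproduced with the standard-library dis module. The exact bytecode differs between Python versions, so no output is listed; the helper count_instructions() is a hypothetical name introduced for this illustration.

```python
import dis


def calc_with_tempvar(x):
    y = x**2
    z = y/2
    f = z + 78
    g = f/333.333
    return g


def calc_without_tempvar(x):
    return ((x**2)/2 + 78)/333.333


def count_instructions(func):
    # Hypothetical helper: number of bytecode instructions in a function.
    return sum(1 for _ in dis.get_instructions(func))


# Print the human-readable disassembly of both functions.
dis.dis(calc_with_tempvar)
dis.dis(calc_without_tempvar)

# Compare how many instructions each function compiles to.
n_with = count_instructions(calc_with_tempvar)
n_without = count_instructions(calc_without_tempvar)
print(n_with, n_without)
```

The additional instructions in calc_with_tempvar() are typically the stores and loads of the intermediate names.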

Temporary variables: Performance

How does this translate into performance, however? To learn this, let’s use the [perftester](https://github.com/nyggus/perftester) Python package, dedicated to benchmarking and testing time and memory performance of Python functions:

Benchmarking Python Functions the Easy Way: perftester

For benchmarks, I used Python 3.11 on a Windows 10 machine, in WSL 1, 32GB of RAM and four physical (eight logical) cores. However, in our case, raw times do not matter that much; we will focus on relative comparisons instead.

First, let me change the default settings for the benchmarks. I will use 20 million function calls repeated 7 times; the quickest among the 7 times will be chosen as the benchmark result.

>>> import perftester
>>> perftester.config.set_defaults(
...     "time",
...     Number=20_000_000,
...     Repeat=7,
... )

Now the actual benchmarks, for a float number:

>>> x = 1.67
>>> t1 = perftester.time_benchmark(calc_with_tempvar, x)
>>> t2 = perftester.time_benchmark(calc_without_tempvar, x)

And let’s see the results¹:

>>> perftester.pp({
...     "1. composition": t1["min"],
...     "2. one-shot": t2["min"],
...     "3. composition--to--one-shot ratio": t1["min"] / t2["min"]
... })
{'1. composition': 2.063e-07,
 '2. one-shot': 1.954e-07,
 '3. composition--to--one-shot ratio': 1.056}

As expected, the one-shot version (without temporary variables) is faster – around 5% faster. On the one hand, it’s not much. On the other hand, it’s 5% achieved by something that small – so small a change!

The above calculations are fast. For long calculations, however, the difference would likely be close to invisible.
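
For readers who do not have perftester installed, a comparable measurement can be sketched with the standard-library timeit module. This is an alternative to the benchmark above, not the method used to produce the reported numbers; the helper best_time_per_call() is a hypothetical name, and the number of calls is kept far smaller than 20 million so that the sketch runs quickly.

```python
import timeit


def calc_with_tempvar(x):
    y = x**2
    z = y/2
    f = z + 78
    g = f/333.333
    return g


def calc_without_tempvar(x):
    return ((x**2)/2 + 78)/333.333


def best_time_per_call(func, arg, number=200_000, repeat=5):
    # Hypothetical helper: run `number` calls, `repeat` times, and return
    # the quickest run divided by the number of calls.
    timings = timeit.repeat(
        "func(arg)",
        globals={"func": func, "arg": arg},
        number=number,
        repeat=repeat,
    )
    return min(timings) / number


x = 1.67
t_with = best_time_per_call(calc_with_tempvar, x)
t_without = best_time_per_call(calc_without_tempvar, x)
print(f"with temporary variables:    {t_with:.3e} s per call")
print(f"without temporary variables: {t_without:.3e} s per call")
print(f"ratio: {t_with / t_without:.3f}")
```

Raw times and the ratio will vary with the machine and Python version, so treat any single run as indicative only.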

Did you notice that we can improve the calc_with_tempvar() function a little bit? Do we need the last object, g? Sometimes an object like this can increase a function’s readability, via a good name, but not in this case – so we don’t need g. Let’s see if getting rid of it will help us in terms of performance:

def calc_with_tempvar_shorter(x):
    y = x**2
    z = y/2
    f = z + 78
    return f/333.333

>>> t3 = perftester.time_benchmark(calc_with_tempvar_shorter, x)
>>> t3["min"]
1.998e-07

A minor improvement, as the composition version is about 1.032 times slower than this one, and this one is about 1.023 times slower than the one-shot version. But again, this improvement was achieved by such a small change! If so, isn’t this small change worth using?
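
The ratios quoted above can be recomputed from the reported minimum times. Because the inputs below are the already rounded figures printed earlier, the last digit may differ slightly from ratios based on unrounded timings.

```python
# Minimum times per call reported above, in seconds (rounded values).
t_composition = 2.063e-07  # calc_with_tempvar
t_shorter = 1.998e-07      # calc_with_tempvar_shorter
t_one_shot = 1.954e-07     # calc_without_tempvar

# How many times slower is each version than the next faster one?
composition_to_shorter = t_composition / t_shorter
shorter_to_one_shot = t_shorter / t_one_shot
composition_to_one_shot = t_composition / t_one_shot

print(f"composition vs. shorter:  {composition_to_shorter:.2f}")
print(f"shorter vs. one-shot:     {shorter_to_one_shot:.2f}")
print(f"composition vs. one-shot: {composition_to_one_shot:.2f}")
```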

Conclusions

For me – it definitely is, but not always.

The point is, when performance does not matter, go for readability. If it really does not change a thing whether your program runs a minute, 10 seconds, or even half a second longer – simply don’t even think about improving performance by such tricks. Why should you? Why should you worsen readability in order to make minor improvements when these improvements do not matter whatsoever? Just go for readability.

Of course, there will be times when getting rid of temporary variables will increase the readability of the function. In this case, why should we even discuss this? Again, just go for readability, and when this means increasing performance, too – well, a perfect situation.

Sometimes performance does matter. Even a split second can make a difference. If this is the case, you should profile your code and find bottlenecks. Other times, you may wish to optimize every single part of the code. One example is working on a framework to be used by others, some of whom will care about performance. In that case, it’s your responsibility – as the framework’s author – to offer as fast a tool as possible. Otherwise, you risk that some users will not use your framework.
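
As a minimal sketch of what such profiling could look like, the standard-library cProfile and pstats modules can report where time is spent. The driver function run_many() is a hypothetical stand-in for real application code, and calc_with_tempvar() is repeated here only to keep the example self-contained.

```python
import cProfile
import io
import pstats


def calc_with_tempvar(x):
    y = x**2
    z = y/2
    f = z + 78
    g = f/333.333
    return g


def run_many(n):
    # Hypothetical driver standing in for real application code.
    return [calc_with_tempvar(i) for i in range(n)]


# Collect profiling data only for the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
run_many(100_000)
profiler.disable()

# Write the report into a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Only when a report like this points at a hot spot is it worth considering whether removing temporary variables there pays off.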

To summarize:

  • If performance matters, avoid using temporary variables like those in calc_with_tempvar(). If performance is of secondary (if any) significance, go for readability – which means that a decision whether to use temporary variables or not should be based solely on code readability.
  • It’s not that temporary variables always increase readability. For instance, imagine you have a mathematical function y(x) = ((x**2)/2 + 78)/333.333. Do you think calc_with_tempvar(), with all those temporary variables, would increase readability? I don’t.

So, sometimes temporary variables will improve code readability; other times they won’t. If performance is of critical importance, remember that temporary variables can add some minor overhead. More often than not, this overhead will be negligible – but in some projects, even those fractions of a second can matter.

All in all, always double-check if it’s worth getting rid of temporary variables in your code – or if it’s worth using them.

Footnotes

¹ The code uses the perftester.pp() function, which pretty-prints (using the standard-library function pprint.pprint()) a Python object with all numbers in it rounded to four significant digits. It does so using the rounder package:

GitHub – nyggus/rounder: Python package for rounding floats and complex numbers in complex Python…



