Progress bars for Python with tqdm

Track the execution of Python iterations with a smart progress bar

Doug Steen
Towards Data Science


Not long after I began working on machine learning projects in Python, I ran into computationally intensive tasks that took a long time to run. Usually these involved some kind of iterative process. Two that immediately come to mind are (1) running a grid search on p, d, and q orders to fit ARIMA models on large data sets, and (2) grid searching hyperparameters while training machine learning algorithms. In both cases, you can spend hours (or more!) waiting for your code to finish running. Desperate for some kind of indicator to show the progress of these tasks, I found tqdm.

What is tqdm?

tqdm is a Python library that displays a smart progress bar when you wrap it around any iterable. A tqdm progress bar shows not only how much time has elapsed, but also the estimated time remaining for the iteration.

Installing and importing tqdm

Since tqdm is available on the Python Package Index (PyPI), it can be installed with the command pip install tqdm.

I often work in IPython/Jupyter notebooks, and tqdm provides excellent support for this. To begin playing with tqdm in a notebook, you can import tqdm and trange from the tqdm.notebook submodule.
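A minimal sketch of such an import, assuming a Jupyter environment with ipywidgets installed (the notebook-style bar relies on it):

```python
# Notebook-flavoured progress bars (rendered as widgets in Jupyter).
from tqdm.notebook import tqdm, trange

# Alternative that picks the notebook or console bar automatically:
# from tqdm.auto import tqdm, trange
```

Outside a notebook, the plain from tqdm import tqdm, trange gives the same interface with a text-based bar, which is what the sketches in the rest of this post use so that they also run as ordinary scripts.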

Examples

For the sake of clarity, I won’t get into a computationally-intensive grid search in this post — instead I’ll use a few simple examples to demonstrate the use of tqdm.

For-loop progress

Let’s say we want to simulate flipping a fair coin 100,000,000 times while tracking the results, and we also want to see how long these iterations take to run in Python. We can wrap the tqdm function around the iterable (range(100000000)), which will generate a progress bar while our for-loop is running. We can also assign a name to the progress bar using the desc keyword argument.
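The coin-flip loop might look something like the sketch below. The variable names and the use of random.random() are assumptions for illustration, and the flip count is cut down to 100,000 so the example finishes in well under a second; set N_FLIPS to 100_000_000 to match the scenario described above.

```python
import random

from tqdm import tqdm  # in a notebook, tqdm.notebook.tqdm is a drop-in swap

N_FLIPS = 100_000  # assumption: reduced from 100,000,000 for a quick run

heads = 0
tails = 0

# Wrapping the iterable in tqdm() draws the bar; desc sets its label.
for _ in tqdm(range(N_FLIPS), desc="Coin flips"):
    if random.random() < 0.5:
        heads += 1
    else:
        tails += 1

print(f"Heads: {heads}, Tails: {tails}")
```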

The resulting tqdm progress bar reports the task completion percentage, the number of iterations completed, the time elapsed, the estimated time remaining, and the iterations completed per second.

In this case, tqdm also offers a convenient shortcut: trange(100000000) can be used in place of tqdm(range(100000000)).
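As a rough sketch of the trange form, again with a reduced flip count and illustrative variable names:

```python
import random

from tqdm import trange

n_flips = 100_000  # assumption: reduced from 100,000,000 for a quick run

# trange(n, ...) is shorthand for tqdm(range(n), ...).
heads = sum(random.random() < 0.5 for _ in trange(n_flips, desc="Coin flips"))
tails = n_flips - heads

print(f"Heads: {heads}, Tails: {tails}")
```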

Nested for-loop progress

If you have a situation that calls for a nested for-loop, tqdm allows you to track the progress of these loops at multiple levels. For example, let’s take our coin-flip example, but this time we want to play three separate “games” of 10,000,000 flips each while tracking the results. We can create a tqdm progress bar for “Overall Progress”, as well as progress bars for each of the three games.
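One way to sketch the nested version is shown below. The structure of the results list and the bar labels for individual games are assumptions, and each game is shortened to 100,000 flips so the example runs quickly.

```python
import random

from tqdm import trange

N_GAMES = 3
FLIPS_PER_GAME = 100_000  # assumption: reduced from 10,000,000 per game

results = []

# Outer bar tracks completed games; each inner bar tracks flips in one game.
for game in trange(N_GAMES, desc="Overall Progress"):
    heads = 0
    for _ in trange(FLIPS_PER_GAME, desc=f"Game {game + 1}"):
        if random.random() < 0.5:
            heads += 1
    results.append({"heads": heads, "tails": FLIPS_PER_GAME - heads})

for number, result in enumerate(results, start=1):
    print(f"Game {number}: {result}")
```

Passing leave=False to the inner trange call would clear each game's bar once it finishes, which can keep terminal output tidier when there are many inner loops.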

Pandas Integration

A slightly different use of tqdm involves integration with pandas. tqdm can add a progress bar to the .apply() method of a pandas dataframe. First, calling tqdm.pandas() registers the .progress_apply() method with pandas. Then .progress_apply() is used in place of the traditional .apply() method; the difference is that a smart progress bar is now included in the method’s output.
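A minimal sketch of this pattern follows. The dataframe contents and the row function are made up for illustration; only tqdm.pandas() and .progress_apply() come from tqdm itself.

```python
import pandas as pd
from tqdm import tqdm

# Register progress_apply on pandas objects; desc labels the bar.
tqdm.pandas(desc="Processing Dataframe")

# Hypothetical 1,000-row dataframe used only for this example.
df = pd.DataFrame({"a": range(1000), "b": range(1000, 2000)})

# Same call shape as df.apply(...), but with a progress bar over the rows.
row_sums = df.progress_apply(lambda row: row["a"] + row["b"], axis=1)

print(row_sums.head())
```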

Processing Dataframe: 100%|██████████| 1000/1000 [00:02<00:00, 336.21it/s]

Additional tqdm integrations

In addition to being integrated with IPython/Jupyter and pandas, tqdm offers integration with Keras and experimental modules for itertools, concurrent, Discord, and Telegram. This post only scratches the surface of the capabilities of tqdm, so be sure to check out the documentation to learn more about how to include smart progress bars in your Python code!
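As one small illustration of those experimental modules, tqdm.contrib.itertools provides a product function that wraps itertools.product in a progress bar, which suits the kind of p, d, q grid search mentioned at the start of this post. In the sketch below, the score function is a hypothetical stand-in for fitting and evaluating a model, not a real ARIMA fit.

```python
from tqdm.contrib.itertools import product

p_values = range(3)
d_values = range(2)
q_values = range(3)


def score(p, d, q):
    """Hypothetical stand-in for fitting a model and returning its error."""
    return abs(p - 1) + abs(d - 1) + abs(q - 2)


best_order = None
best_score = float("inf")

# One bar covers every (p, d, q) combination; lower score is better here.
for p, d, q in product(p_values, d_values, q_values, desc="Grid search"):
    current = score(p, d, q)
    if current < best_score:
        best_order, best_score = (p, d, q), current

print(f"Best order: {best_order} (score {best_score})")
```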
