
I recently did a coding interview where one of the passing criteria was how optimized the code was. Unfortunately, I failed the test not because the code did not work (it did) or because the logic was wrong (it was not), but because my code was not optimized.
For a data scientist, efficient code:
- Saves time when processing and analyzing large or complex data.
- Promotes scalability, ensuring your code can handle larger datasets and more complex models.
- Is reusable and modular, which again saves time and reduces errors.
- Is easily maintained and updated because it’s simple and understandable.
- Is shareable with a broader audience because it can run on less powerful hardware.
Optimized code == efficient code
In Python, efficient code is:
- Pythonic – It uses Python’s unique style and idioms as intended by the language’s creators and the community.
- Readable – It’s easy to read and understand what the code does. For example, it follows correct naming conventions, is mindful of white space, and uses fewer lines of code where possible.
- Fast – It should run in the least possible time, consuming minimal memory and resources.
Companies and employers prefer optimized code that can easily scale and allow new developers to get on board quickly.
In this article, we’ll cover four Python magic commands that test how efficient our code is. We’ll also perform tasks using different code approaches and measure which method is most efficient.
Magic commands – These are special Python commands that start with % or %% and are supported in Jupyter notebooks and the IPython kernel. They provide a quick and powerful way to perform tasks such as timing code (discussed in this article), displaying visualizations, and navigating directories.
- Line magics: These have a single % and operate on one line of input.
- Cell magics: These have two %% and operate on multiple lines of code, or a cell block.
Note: You may be familiar with the ! symbol, which is a short form of the magic command %system. This command executes shell commands directly in the notebook, such as installing packages using !pip install package.
To display all built-in Python magic commands, use %lsmagic.

To find out what a magic command does, use %magiccommand? to display its documentation in place.

1. %timeit
This magic command measures the time it takes for a single line of code to execute. It runs the code several times and returns the average execution time.
%timeit syntax: The command is followed by the code to test, all in one line.
%timeit code_to_execute
Example output
34.6 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Output explained:
- 34.6 ns = The average execution time per loop (1 ns = one billionth of a second; 1 µs = one millionth).
- 1.17 ns = The standard deviation of the measurements.

- 7 runs = The number of runs, or iterations, to repeat the process. We have different runs to account for variations in factors such as memory usage and CPU load, which may remain the same in one run but differ in others.
- 10,000,000 loops = The number of times to execute the code per iteration. Therefore, the code runs a total of runs * loops times.
The number of runs and loops are automatically determined based on the code complexity, but you can also pass them as arguments, as discussed below.
Task 1: Timing one line of code – compare [] vs list() for instantiating a list.
Creating a list using literal symbols []
%timeit l1=['sue','joe','liz']
###Result
34.6 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Creating a list using list()
%timeit l2=list(['sue','joe','liz'])
###Result
92.8 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Output explained: Using the literal symbols takes 34.6 ns, less than half the time when using the function name (92.8 ns).
Therefore, when instantiating a Python list, tuple, or dictionary, it is more efficient to use the literal symbols than the function names.
#Efficient
lst = []
tup = ()
dct = {}
#Not Efficient
lst = list()
tup = tuple()
dct = dict()
The same applies when creating a list of numbers with the range function. Unpacking with * is more efficient than calling the list() function.
#Efficient
lst = [*range(10)]
#Less efficient
lst = list(range(10))
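Outside a notebook, the same comparison can be sketched with the standard-library timeit module (a minimal sketch; the timings will vary by machine):

```python
import timeit

# Time the two list-construction styles; absolute numbers vary by machine.
unpack_time = timeit.timeit("lst = [*range(10)]", number=1_000_000)
func_time = timeit.timeit("lst = list(range(10))", number=1_000_000)

print(f"[*range(10)]    : {unpack_time:.3f} s for 1,000,000 loops")
print(f"list(range(10)) : {func_time:.3f} s for 1,000,000 loops")
```

Both forms produce identical lists; only the construction cost differs.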
Specifying runs and loops – After the %timeit command, you can pass in your desired runs and loops as arguments using -r and -n respectively.
%timeit -r 5 -n 1000 lst=['sue','liz','joe']
###Result
42 ns ± 0.458 ns per loop (mean ± std. dev. of 5 runs, 1000 loops each)
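The -r/-n pair maps directly onto the stdlib timeit.repeat, where repeat is the number of runs and number is the loops per run; a minimal sketch:

```python
import timeit

# 5 runs of 1000 loops each, mirroring `%timeit -r 5 -n 1000`.
times = timeit.repeat("names = ['sue', 'liz', 'joe']", repeat=5, number=1000)

# Each entry in `times` is the total seconds for one run of 1000 loops.
best_per_loop = min(times) / 1000
print(f"best of 5 runs: {best_per_loop * 1e9:.1f} ns per loop")
```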
2. %%timeit
This command is prefixed by two percentage signs. It measures the average time taken to execute a cell block that contains multiple code lines.
%%timeit Syntax: The command is written at the start of a cell block, immediately followed by the lines of code to be timed beneath it.
%%timeit
code_line_1
code_line_2
...
Task 2: Timing multiple lines of code (a cell block) – compare a for-loop vs a list comprehension that squares the numbers from 0 to 999.
For-loop – Below, we use %%timeit and pass in our desired number of runs (5) and loops per run (1000).
%%timeit -r 5 -n 1000
squared_list=[]
for num in range(1000):
    num_squared = num**2
    squared_list.append(num_squared)
###Result
198 µs ± 9.31 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
The code takes 198 microseconds to execute.
List comprehension – Here we use %timeit with one percentage sign because we’re measuring only one line of code.
%timeit -r 5 -n 1000 squared_list=[num**2 for num in range(1000)]
###Result
173 µs ± 7.22 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
The list comprehension code is faster at 173 microseconds.
Therefore, whenever possible, and if it does not compromise readability, use a list comprehension over a for-loop.
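Both versions build exactly the same list, so the comprehension’s speed comes for free; a quick equivalence check:

```python
# For-loop version
squared_loop = []
for num in range(1000):
    squared_loop.append(num**2)

# List-comprehension version
squared_comp = [num**2 for num in range(1000)]

# Same result either way; only the construction speed differs.
assert squared_loop == squared_comp
```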
3. %lprun – Line profiling
This command comes from the line_profiler library, which profiles the time performance of a Python function, program, or script.
It checks how long each line of code in the function takes and returns an output of the line-by-line analysis.
%lprun syntax: the command is followed by -f, which means we are profiling a function. You then pass in the function name, followed by the function call with its arguments.
%lprun -f function_name function_name(args)
The line profiler is not built into Python and needs to be installed the first time you use it on your system. You also need to load it into the IPython session every time you run a new kernel.
!pip install line_profiler
%load_ext line_profiler
The table returned is an analysis of each line in the function, with the following columns:
- Line number: the position of the line in the function.
- Hits: Number of times the line was executed.
- Time: Total time taken by the line. The timer unit is specified at the top of the table.
- Per Hit: The average time it took to execute a line (Time/Hits).
- % Time: The percentage of time taken per line compared to other lines.
- Line contents: The actual source code of the line.
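If line_profiler is not available, the stdlib cProfile offers a rough, function-level (not line-by-line) stand-in; a minimal sketch with a hypothetical remove_dups function:

```python
import cProfile
import io
import pstats

def remove_dups(lst):
    # Hypothetical example function to profile.
    uniques = []
    for name in lst:
        if name not in uniques:
            uniques.append(name)
    return uniques

# Collect stats for one call and print the top 5 entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
remove_dups(['sue', 'joe', 'sue', 'liz'] * 100)
profiler.disable()

buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('cumulative').print_stats(5)
print(buffer.getvalue())
```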
Task 3: Timing a function – compare a for-loop vs a built-in Python function for removing duplicates from a list.
In this example, both functions take in a list, remove duplicates, and return a list of unique items.
Using a for-loop
def remove_dups1(lst):
    uniques = []
    for name in lst:
        if name not in uniques:
            uniques.append(name)
    return uniques
%lprun -f remove_dups1 remove_dups1(lst)

The timer unit is in seconds (1e-07 s), which equals 0.1 microseconds. The whole function ran for 14.6 microseconds, and the for-loop lines were executed many times (many hits).

Using the set() function
def remove_dups2(lst):
    return list(set(lst))
%lprun -f remove_dups2 remove_dups2(lst)

This function only had one code line, which was run once (1 hit). The whole function ran for 3.3 microseconds.
For this reason, whenever possible, utilize the built-in functions that perform the task you need, as they are optimized for their operations. Here is a list of built-in Python functions that you can leverage in your code.
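One caveat not covered above: set() does not preserve the original order of the list. When order matters, dict.fromkeys (dicts keep insertion order since Python 3.7) gives a fast, order-preserving alternative:

```python
names = ['sue', 'joe', 'sue', 'liz', 'joe']

# Fast but unordered: set() may shuffle the unique items.
unique_any_order = list(set(names))

# Fast and order-preserving: dicts keep insertion order in Python 3.7+.
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)  # ['sue', 'joe', 'liz']
```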
4. %mprun – Memory profiling
This command comes from the memory_profiler library, which outlines the memory usage of a function.
Therefore, where %lprun measures time, %mprun measures the memory consumed and returns a line-by-line analysis of memory usage.
However, with %mprun, the function needs to be saved in a separate Python file. This file can be saved in your current working directory; you then import the function into your session and run the command on it. We’ll do all this shortly.
Again, you need to install the memory profiler library into your system, then load it into the current kernel session.
!pip install memory_profiler
%load_ext memory_profiler
%mprun syntax: the command is followed by -f, then the function name, and finally the function call.
from my_file import func_name
%mprun -f func_name func_name(params)
The table returned contains the following information for every code line:
- Line #: The line number being executed.
- Mem usage: The memory used by the Python interpreter after executing this line of code, measured in mebibytes (MiB).
- Increment: The difference in memory used from the previous line. Think of it as the impact of this line on memory.
- Occurrences: The number of times this line was executed.
- Line Contents: The source code on that line.
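If memory_profiler is not installed, the stdlib tracemalloc gives a coarse, whole-snippet view of allocations (no line-by-line table); a minimal sketch:

```python
import tracemalloc

# Measure the memory allocated while building a large list.
tracemalloc.start()
squares = [x * x for x in range(100_000)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1e6:.2f} MB, peak: {peak / 1e6:.2f} MB")
```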
Task 4: Profiling a function on a Pandas DataFrame – What is the most memory-efficient way to perform calculations on a Pandas column?
In our example below, we’ll work with a Pandas DataFrame and perform some calculations on a column. I am using the Kaggle dataset ‘fuel consumption’, available here under the Open Database license.
First, import the Pandas library, then load the dataset into the current session. Be sure to install Pandas first if the code returns a ‘module not found’ error.
import pandas as pd
data = pd.read_csv('Fuel_Consumption_2000-2022.csv')

Each function takes in a Pandas DataFrame, multiplies a column’s values by a scalar, and returns a modified DataFrame. We will test four functions to find the most memory-efficient approach.
Remember that %mprun must access the function from a file. To save the functions in one file, run the cell block below, where the top line is %%file my_file.py. This creates (or overwrites) my_file.py with the cell’s contents.
%%file my_file.py
import pandas as pd  # needed by calc_numpy below

def calc_apply(df):
    column = df['COMB (mpg)']
    new_vals = column.apply(lambda x: x * 0.425)
    df['kml'] = new_vals
    return df

def calc_listcomp(df):
    column = df['COMB (mpg)']
    new_vals = [x * 0.425 for x in column]
    df['kml'] = new_vals
    return df

def calc_direct(df):
    column = df['COMB (mpg)']
    new_vals = column * 0.425
    df['kml'] = new_vals
    return df

def calc_numpy(df):
    column = df['COMB (mpg)'].values
    new_vals = column * 0.425
    df['kml'] = pd.Series(new_vals)
    return df
Next, load the memory profiler extension and import your functions from the file.
%load_ext memory_profiler
from my_file import calc_apply, calc_listcomp, calc_direct, calc_numpy
Method 1: Using apply with a lambda function
%mprun -f calc_apply calc_apply(data.copy())

The apply function line where the multiplication happens results in 45,000 occurrences and a 1.6 MB memory increment.
Method 2: Using a list comprehension
%mprun -f calc_listcomp calc_listcomp(data.copy())

Using a list comprehension halves the number of occurrences to around 22,500. However, a similar memory increase of 1.7 MB is noted for the two lines.
Method 3: Direct multiplication.
%mprun -f calc_direct calc_direct(data.copy())

Using the direct multiplication method results in only one occurrence of this item in memory and a much smaller memory increment of 0.4 MB.
Method 4: Using NumPy by first calling Series.values to convert the column into a NumPy array.
%mprun -f calc_numpy calc_numpy(data.copy())

The fourth method first converts the column into a NumPy array and then multiplies it by the scalar value. As in method 3, there is only one occurrence of the item in memory and a similar memory increase of 0.4 MB.
Speed of direct multiplication vs NumPy multiplication
The NumPy calculation is faster, even though it consumes the same memory as the direct method. See the results of the two functions using %lprun, which measures the time taken per line.
Direct multiplication – Slower
%lprun -f calc_direct calc_direct(data.copy())

NumPy’s calculation – Faster
%lprun -f calc_numpy calc_numpy(data.copy())

The NumPy calculation (where the column is first converted into a NumPy array using Series.values) is faster, taking only 137 ms compared to 1,150 ms for the direct multiplication. The percentage of time is also much lower at 9.7%, compared to 45% for direct multiplication.
For this reason, numeric calculations on a DataFrame column are most efficient with NumPy, as it is optimized for element-wise operations.
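The core of the win is NumPy’s vectorized, element-wise arithmetic, which works the same outside Pandas; a tiny sketch with made-up mpg values (not the Kaggle dataset):

```python
import numpy as np

# Hypothetical mpg values; multiply element-wise by the mpg-to-km/l factor.
mpg = np.array([10.0, 20.0, 30.0])
kml = mpg * 0.425  # one vectorized C-level loop, no Python-level iteration

print(kml)
```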
Conclusion
In this article, we discussed the importance of writing efficient and optimized code in Python. We looked at different code examples and identified the most efficient coding approach for each task.
We explored four magic commands: %timeit, %%timeit, %lprun, and %mprun. The first three measure the time taken to execute code, while the last measures the memory consumed. We also learned that line magics operate on one line of code and are prefixed by one %, while cell magics are prefixed by %% and operate on the multiple code lines directly beneath them.
I hope you enjoyed the article. To receive more like these whenever I publish a new one, subscribe here. If you are not yet a medium member and would like to support me as a writer, follow this link to subscribe for $5 and I will earn a small commission. Thank you for reading!