What’s the difference between loc[] and iloc[] in Python and pandas

Introduction
Indexing and slicing pandas DataFrames in Python can be tricky. The two properties most commonly used for slicing are iloc and loc.
In today’s article we are going to discuss the difference between these two properties. We’ll also go through a couple of examples to make sure you understand when to use one over the other.
First, let’s create a pandas DataFrame that we’ll use as an example to demonstrate a few concepts.
import pandas as pd
df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)
print(df)
#     a   b   c
# 4   1   2   3
# 6   4   5   6
# 2   7   8   9
# 1  10  11  12
Slicing using loc[]
The loc[] property is used to slice a pandas DataFrame or Series and to access rows and columns by label. This means that the labels you pass in are matched against the index labels of the rows to return.
Therefore, if we pass an integer to loc[], it is interpreted as an index label and not as a position. In the example below, loc returns the row whose index label is 1.
>>> df.loc[1]
a    10
b    11
c    12
Name: 1, dtype: int64
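As a small sketch of what label-based lookup implies, the snippet below rebuilds the example DataFrame and shows that an integer which is not an index label makes loc raise a KeyError, even though the same integer is a valid position. The variable name missing_label_raised is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)

# 0 is a valid position, but it is not a label in the index [4, 6, 2, 1],
# so a label-based lookup fails with KeyError.
try:
    df.loc[0]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

print(missing_label_raised)  # True

# The label 1 exists, so this lookup succeeds and returns the last row.
print(df.loc[1].tolist())  # [10, 11, 12]
```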
loc also accepts a list or array of labels:
>>> df.loc[[6, 2]]
   a  b  c
6  4  5  6
2  7  8  9
Similarly, we can use a slice to retrieve a range of labels. In the example below, notice how the slice is evaluated: 4:2 refers to labels, not positions. In other words, it tells pandas to return all rows from label 4 through label 2, following the order of the index and including both endpoints.
>>> df.loc[4:2]
   a  b  c
4  1  2  3
6  4  5  6
2  7  8  9
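The following is a minimal sketch of how label slices behave on the example DataFrame, assuming the same index as above. The names first_three and last_three are illustrative only. It shows that both endpoints of a loc slice are included and that rows come back in index order rather than numeric order.

```python
import pandas as pd

df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)

# Both endpoints are labels, and both are included in the result.
first_three = df.loc[4:2]
print(first_three.index.tolist())  # [4, 6, 2]

# The slice follows the order of the index, not the numeric order of the labels.
last_three = df.loc[6:1]
print(last_three.index.tolist())  # [6, 2, 1]
```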
Slicing using iloc[]
On the other hand, the iloc property offers integer-location based indexing, where the position is used to retrieve the requested rows.
Therefore, whenever we pass an integer to iloc, we should expect to retrieve the row at the corresponding position. In the example below, iloc[1] returns the row at position 1 (i.e. the second row, since positions start at 0):
>>> df.iloc[1]
a    4
b    5
c    6
Name: 6, dtype: int64
# Recall the difference between loc[1]
>>> df.loc[1]
a    10
b    11
c    12
Name: 1, dtype: int64
As with loc, you can pass a list or array of positions to retrieve a subset of the original DataFrame. For example:
>>> df.iloc[[0, 2]]
   a  b  c
4  1  2  3
2  7  8  9
Or even a slice object of integers:
>>> df.iloc[1:3]
   a  b  c
6  4  5  6
2  7  8  9
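To make the contrast with label slices concrete, here is a short sketch on the same example DataFrame. The variable names are illustrative. A positional slice excludes its stop position, as with Python lists, whereas the equivalent label slice includes its last label.

```python
import pandas as pd

df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)

# iloc slices exclude the stop position: this selects positions 1 and 2 only.
by_position = df.iloc[1:3]
print(by_position.index.tolist())  # [6, 2]

# The equivalent loc slice names the last label explicitly, because it is included.
by_label = df.loc[6:2]
print(by_position.equals(by_label))  # True

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]
print(last_row.tolist())  # [10, 11, 12]
```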
iloc can also accept a callable that takes a single argument (the pd.Series or pd.DataFrame being indexed) and returns an output that is valid for indexing.
For instance, to retrieve only the rows with an odd index label, a simple lambda function that returns a boolean mask does the trick:
>>> df.iloc[lambda x: x.index % 2 != 0]
    a   b   c
1  10  11  12
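It may help to look at what the callable actually returns. The sketch below uses a named function, has_odd_label, which is a hypothetical helper introduced only for this illustration. It shows that the callable yields one boolean per row, and that passing the callable gives the same result as passing that mask directly.

```python
import pandas as pd

df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)


def has_odd_label(frame):
    """Return one boolean per row: True where the index label is odd."""
    return frame.index % 2 != 0


# The callable receives the DataFrame and returns a boolean mask.
mask = has_odd_label(df)
print(mask.tolist())  # [False, False, False, True]

# Passing the callable is equivalent to passing the mask it produces.
odd_rows = df.iloc[has_odd_label]
print(odd_rows.index.tolist())  # [1]
print(odd_rows.equals(df.iloc[mask]))  # True
```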
Finally, you can also use iloc to index both axes. For example, to fetch the first two rows and drop the last of the three columns, call:
>>> df.iloc[:2, :2]
   a  b
4  1  2
6  4  5
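loc can index both axes in the same way, using row and column labels instead of positions. The sketch below, with illustrative variable names, selects the same block of the example DataFrame both ways and also compares a single-cell lookup.

```python
import pandas as pd

df = pd.DataFrame(
    index=[4, 6, 2, 1],
    columns=['a', 'b', 'c'],
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
)

# Positions: first two rows, first two columns (stop positions excluded).
by_position = df.iloc[:2, :2]

# Labels: rows 4 through 6 and columns 'a' through 'b' (endpoints included).
by_label = df.loc[4:6, 'a':'b']

print(by_position.equals(by_label))  # True

# A single cell can be addressed by position or by label.
print(df.iloc[0, 0] == df.loc[4, 'a'])  # True
```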
Final Thoughts
In this article we discussed how to properly index and slice pandas DataFrames (or Series) using two of the most commonly used properties, namely loc and iloc.
It’s important to understand the differences between these two properties and to be able to use them effectively for your specific use case. loc indexes a pandas DataFrame or Series by label, and its slices include both endpoints. iloc, on the other hand, retrieves records by their integer position, and its slices exclude the stop position.