
How To Suppress SettingWithCopyWarning in Pandas

Understanding the difference between copies and views in pandas and how to deal with SettingWithCopyWarning

Photo by Sandy Ravaloniaina on Unsplash

Introduction

SettingWithCopyWarning is among the most common issues pandas newcomers run into. This article explains why the warning is raised in the first place and discusses how to suppress it. We’ll also cover a few tips and best practices for avoiding the warning altogether.

Even though SettingWithCopyWarning is only a warning, you must make sure you understand precisely why it is being raised in order to avoid unexpected behaviour.


First, let’s create a dummy dataset that we’ll use throughout this post.

import numpy as np
import pandas as pd
# Set random seed so that results are reproducible
np.random.seed(0)
df = pd.DataFrame(
    np.random.choice(100, (3, 4)), 
    columns=list('ABCD')
)
print(df)
#     A   B   C   D
# 0  44  47  64  67
# 1  67   9  83  21
# 2  36  87  70  88

What is the SettingWithCopyWarning

Before discussing how to suppress SettingWithCopyWarning, it helps to first understand what the warning is about and what triggers it.

SettingWithCopyWarning is a warning, not an error, which means that your code may still run. However, it’s important not to ignore it but instead to understand why it was raised. That makes it much easier to adjust your code so that the warning no longer appears.

Views and copies in pandas

When you perform filtering operations over pandas DataFrames, the result may be a view or a copy of the DataFrame itself depending on some implementation details related to the structure of the df.

Views share the underlying data with the original DataFrame, so when you modify a view you may also modify the original object. Copies are independent replicas of (subsets of) the original DataFrame, so changes made to a copy have no effect on the original object.

Source: Author

To demonstrate the difference between copies and views, let’s consider the example below.

>>> df_slice = df.iloc[:3, :3]
>>> df_slice
    A   B   C
0  44  47  64
1  67   9  83
2  36  87  70

This simple slicing returns a view (at least in pandas versions that do not use Copy-on-Write), which means that changes to the original df are reflected in df_slice and vice versa.

>>> df_slice.iloc[1, 1] = 1
>>>
>>> df_slice
    A   B   C
0  44  47  64
1  67   1  83
2  36  87  70
>>>
>>> df
    A   B   C   D
0  44  47  64  67
1  67   1  83  21
2  36  87  70  88
>>>
>>> df.iloc[1, 1] = -1
>>>
>>> df_slice
    A   B   C
0  44  47  64
1  67  -1  83
2  36  87  70

On the other hand, operations on copies won’t have any effect on the original DataFrame. For example, the operation below returns a copy instead of a view:

>>> df.loc[df.A > 5, 'B']
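As a minimal sketch of what "copy" means here, the snippet below modifies the result of a boolean-mask selection and then checks that the original column is untouched. The frame is a hand-written stand-in for the dummy dataset, and the names b_before and selected are introduced only for illustration.

```python
import pandas as pd

# Hand-written stand-in for the dummy dataset (values are illustrative).
df = pd.DataFrame(
    {"A": [44, 67, 36], "B": [47, 9, 87], "C": [64, 83, 70], "D": [67, 21, 88]}
)
b_before = df["B"].copy()  # snapshot of column B for comparison

# Boolean-mask selection: the result holds its own data.
selected = df.loc[df["A"] > 40, "B"]

# Depending on the pandas version, this write may emit SettingWithCopyWarning.
selected.iloc[0] = -999

print(selected.iloc[0])          # the selection was changed
print(df["B"].equals(b_before))  # whether the original column is unchanged
```

The write lands on the selected Series only, which is exactly the situation the warning is meant to flag.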

Now, if you apply operations to a copy, there is a chance you will come across the warning below:

__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

The message simply warns the users that they are operating on a copy and not the original object itself. In the following section we will discuss the problem of chained assignment or indexing that triggers this particular warning.


Chained assignment

As mentioned above, SettingWithCopyWarning indicates potential chained assignments. First let’s define a few terms in order to ensure we all speak the same language.

  • Assignment is an operation that assigns (or sets) values
  • Access is an operation that returns (or gets) the values. For instance, when we index a DataFrame we pretty much access it.
  • Indexing is an operation that assigns or accesses values and potentially references only a subset of the original data (for example, a subset of columns and/or rows)
  • Chaining occurs when we perform multiple indexing operations back to back. For instance, df[1:][1:5] is a chaining operation.

Chained assignment is therefore the combination of chaining and assignment. To illustrate this type of operation, let’s consider an example where we want to assign the value 100 to column B for every record where A equals 44:

>>> df[df.A == 44]['B'] = 100

Normally, the above operation should trigger the warning:

__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

If we now print the original df, we can see that the above operation had no effect at all:

>>> df
    A   B   C   D
0  44  47  64  67
1  67  -1  83  21
2  36  87  70  88

So in the above example, the warning was raised because we chained two operations together:

  • df[df.A == 44]
  • ['B'] = 100

Both operations will be executed sequentially one after the other in an independent context. The first operation is an access operation, that is a get operation that returns a DataFrame based on the filter condition such that the value of column A is equal to the numerical value 44. The second operation is an assignment operation that sets specific values on the copy of the original DataFrame.
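The two-step behaviour described above can be sketched by writing the chain out as explicit method calls. This is only an illustration on a hand-written stand-in frame; the warnings module is used here purely to keep the demo output quiet, and the name mask is introduced for readability.

```python
import warnings

import pandas as pd

# Hand-written stand-in for the dummy dataset (values are illustrative).
df = pd.DataFrame(
    {"A": [44, 67, 36], "B": [47, 9, 87], "C": [64, 83, 70], "D": [67, 21, 88]}
)
mask = df["A"] == 44

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence warnings only inside this demo
    # Equivalent to: df[mask]["B"] = 100
    # 1) __getitem__ builds a new, temporary DataFrame from the mask
    # 2) __setitem__ writes to that temporary object, which is then discarded
    df.__getitem__(mask).__setitem__("B", 100)

print(df["B"].tolist())  # column B of the original frame
```

Because the assignment targets the temporary object returned by the first call, the original df keeps its values.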

In the next sections we will discuss a few ways to make this kind of operation safer so that the warning is no longer raised.


How to suppress SettingWithCopyWarning

In this section we are going to discuss the following workarounds that can be used to fix your code so that SettingWithCopyWarning is not being raised at all.

  • Use loc[] to select and assign in a single operation so that SettingWithCopyWarning is not raised
  • Take a deep copy of the original DataFrame before performing the assignment operation
  • Disable the check for chained assignments so that SettingWithCopyWarning is no longer raised

Using loc for slicing

Now if we carefully inspect the raised warning we’ll notice that it also comes with a suggestion:

__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

One approach to suppressing SettingWithCopyWarning is to combine the chained operations into a single loc operation. This ensures that the assignment happens on the original DataFrame instead of a copy, so the warning should no longer be raised.

To illustrate how loc can be used to suppress SettingWithCopyWarning let’s consider once again the example of the previous section.

>>> df[df.A == 44]['B'] = 100

The above statement could be rewritten as:

>>> df.loc[df.A == 44, 'B'] = 100

You’ll now notice that the warning is no longer being raised and that the assignment operation this time has an effect on the original DataFrame:

>>> df
    A    B   C   D
0  44  100  64  67
1  67   -1  83  21
2  36   87  70  88

In general, you need to make sure to use loc for label indexing and iloc for integer or positional indexing as it is guaranteed that they operate on the original object. For more details about the difference between loc and iloc and how to use them make sure to read the article below.

loc vs iloc in Pandas
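As a short sketch of the loc/iloc guidance above, the snippet below performs one label-based and one position-based assignment in a single indexing step each, then inspects the original frame. The data is a hand-written stand-in for the dummy dataset.

```python
import pandas as pd

# Hand-written stand-in for the dummy dataset (values are illustrative).
df = pd.DataFrame(
    {"A": [44, 67, 36], "B": [47, 9, 87], "C": [64, 83, 70], "D": [67, 21, 88]}
)

# Label-based: row filter and column label in one loc call.
df.loc[df["A"] == 44, "B"] = 100

# Position-based: second row, third column (column "C") in one iloc call.
df.iloc[1, 2] = 0

print(df)
```

Both assignments go through a single indexer on df itself, so there is no intermediate object for the write to get lost in.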


Using deep copies

A different approach requires us to take a deep copy of the original DataFrame before attempting to perform the chained assignment.

First, let’s showcase that the problem still occurs even when we split the chained operation into two statements:

>>> df_2 = df[df.A == 44]
>>> df_2['B'] = 100
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Now another workaround is to create a deep copy from the first slicing operation using the copy() method.

>>> df_2 = df[df.A == 44].copy(deep=True)
>>> df_2['B'] = 100
>>>

Voila! The warning is not raised.

>>> df_2
    A    B   C   D
0  44  100  64  67
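To make the effect of the explicit copy concrete, here is a minimal sketch that repeats the steps above on a hand-written stand-in frame and then compares the copy with the original.

```python
import pandas as pd

# Hand-written stand-in for the dummy dataset (values are illustrative).
df = pd.DataFrame(
    {"A": [44, 67, 36], "B": [47, 9, 87], "C": [64, 83, 70], "D": [67, 21, 88]}
)

# Filter first, then detach the result from df with an explicit deep copy.
df_2 = df[df["A"] == 44].copy(deep=True)
df_2["B"] = 100  # writes to the independent copy only

print(df_2)
print(df)
```

The intent is now explicit: df_2 is a separate object that you mean to modify, and df is left as it was. If the goal is to change df itself, the loc approach from the previous section is the one to use.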

It’s also important to understand the underlying differences between shallow and deep copies of Python objects. If you want to learn more make sure to give the article below a read.

What’s the Difference Between Shallow and Deep Copies in Python?


Ignoring the warning

SettingWithCopyWarning essentially warns users that an operation may have been performed on a copy rather than the original object. However, there are also false positives, which means that the warning may not be accurate. In this case, you can disable the check and the warning will no longer be raised. If you are new to pandas, treat this as a last resort.

For example, the slicing below generates a copy. You can check whether the resulting object is a view using the ._is_view attribute (note that the leading underscore marks it as a private attribute).

>>> df_copy = df.loc[df.A > 32, 'B']
>>> df_copy
0    100
1     -1
2     87
Name: B, dtype: int64
>>>
>>> df_copy._is_view
False

Now if we attempt to do

>>> df_2 = df[['A']]
>>> df_2['A'] += 2

You’ll see the warning

__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

even though the result would be correct:

>>> df_2
    A
0  46
1  69
2  38
# the operation before had no effect on the original df
>>> df
    A    B   C   D
0  44  100  64  67
1  67   -1  83  21
2  36   87  70  88

In such cases you can set pd.options.mode.chained_assignment to None:

>>> pd.options.mode.chained_assignment = None
>>> df_2['A'] += 2
>>>

Note: Don’t forget to set it back to 'warn', which is the default value. The options are None, 'warn', or 'raise'.
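One way to avoid forgetting the reset is to scope the change with pd.option_context, which restores the previous value when the block exits. The sketch below assumes the mode.chained_assignment option is available in your pandas version; the frame is a hand-written stand-in, and the names previous, inside and restored exist only to show the option value at each stage.

```python
import pandas as pd

# Hand-written stand-in for the dummy dataset (values are illustrative).
df = pd.DataFrame(
    {"A": [44, 67, 36], "B": [47, 9, 87], "C": [64, 83, 70], "D": [67, 21, 88]}
)

previous = pd.get_option("mode.chained_assignment")

# The option is None only inside the with-block.
with pd.option_context("mode.chained_assignment", None):
    inside = pd.get_option("mode.chained_assignment")
    df_2 = df[["A"]]
    df_2["A"] += 2  # no SettingWithCopyWarning while the check is disabled

restored = pd.get_option("mode.chained_assignment")

print(inside, restored)
print(df_2["A"].tolist())
```

Keeping the disabled check confined to a small block limits the risk of hiding a genuine chained assignment elsewhere in the code.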


Final Thoughts

In today’s article we discussed what SettingWithCopyWarning is and when it is raised. We’ve seen in practice the difference between a copy and a view of a pandas DataFrame or Series and how this can trigger SettingWithCopyWarning under specific conditions.

Even though SettingWithCopyWarning is only a warning that may not cause your Python code to fail, you must make sure you understand why it is raised and try to adjust your code using the techniques we discussed earlier. On rare occasions the warning may not really affect your results. If you are sure it won’t cause you any trouble, you can even disable the check by setting the option we’ve seen in the last section of the article.


Next Steps

In this article we explored a lot of concepts such as indexing, slicing, copying etc. The articles below discuss these concepts in more depth so make sure to give them a read in order to ensure that you can follow all the concepts explained in this post.

Mastering Indexing and Slicing in Python

Dynamic Typing in Python

