Many programming languages have their own mechanisms for garbage collection, that is, for eventually removing objects that are no longer used but still occupy memory. This matters because it lets a program use its memory more efficiently.
Have you ever thought about how Python’s garbage collection works? In particular, how does Python know that an object is no longer needed? In this article, I’ll demonstrate this mechanism using two built-in functions, id() and getrefcount().
Show Memory Address

Before we continue with the garbage collection mechanism, we need the concept of memory addresses. Don’t worry, it doesn’t have to be a deep dive. I’ll demonstrate it using the id() function, and that will be enough.
Firstly, let’s define two Python lists with exactly the same content.
a = [1, 2, 3]
b = [1, 2, 3]

At first glance, variables a and b look the same, since they hold equal values. However, does that mean these two variables point to the same memory address? No. Let’s verify it.
id(a)
id(b)

The id() function gives us the "identity" of an object as an integer; in CPython, this is the object’s memory address. The two integers are different. So, variables a and b point to different memory addresses, although their contents are equal at the moment.
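As a minimal sketch of the difference between equal values and identical objects, the snippet below compares the two lists with == and with the is operator. The is operator is not used elsewhere in this article; it is introduced here only because it compares identities, which amounts to comparing the id() values.

```python
a = [1, 2, 3]
b = [1, 2, 3]

# == compares the contents of the two lists
same_value = (a == b)

# "is" compares identities, which is equivalent to comparing id() values
same_object = (a is b)
same_id = (id(a) == id(b))

print(same_value, same_object, same_id)  # True False False
```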
If we create another variable a1 with a1 = a, no new object is created. Instead, a1 points to the same memory address as a.

That makes sense: when we modify the list in place through a, the change is also visible through a1.
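The aliasing behaviour can be sketched as follows. The last step, rebinding a to a new list, is an extra illustration that is not part of the original walkthrough; it shows that only in-place changes are shared, not reassignments.

```python
a = [1, 2, 3]
a1 = a  # no new list is created; a1 is another name for the same object

print(id(a) == id(a1))  # True

# Modify the list in place through a; the change is visible through a1
a.append(4)
print(a1)  # [1, 2, 3, 4]

# Rebinding a to a new list does not affect a1
a = [0]
print(a1)  # [1, 2, 3, 4]
```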

The Reference Count

Now we come to the most important concept: the reference count.
The reference count in Python indicates the number of references to a certain object. This is important because the garbage collection mechanism relies on it to decide whether the object should be retained or released in memory.
That is, when an object’s reference count drops to zero, the object is released. This is intuitive and reasonable: when nothing references an object any more, the object has been abandoned and is useless.
How can we get the reference count, then? It could have been an internal detail hidden from developers. However, Python provides a function called getrefcount() in the sys module that queries the reference count of an object.
To use this function, we need to import it from the sys module. The module is part of the standard library in every version of Python 3, so you don’t need to download or install anything to use it.
from sys import getrefcount
Then, let’s use this function to query the reference count.
a = [1, 2, 3]
print(getrefcount(a))

In this example, I created a variable a and assigned a simple Python list to it. Then, the getrefcount() function shows that the reference count of this object is 2.
But hold on, why is it 2? Please have a look at the graph below.

In fact, when we use the getrefcount() function to query the reference count of an object, the call itself has to hold a reference to the object as its argument. That’s why the reference count is 2: both the variable a and the getrefcount() call are referencing the list [1, 2, 3].
What will increase the reference count?

Now that we understand the reference count and how to query it for an object, what causes it to change? Each of the following actions increases the reference count by 1.
1. The object is created and assigned to a variable.
This was demonstrated in the previous section. When we created the Python list object [1, 2, 3] and assigned it to the variable a, the reference count of the list object [1, 2, 3] was set to 1.
2. The object is assigned to one more variable.
When the object is assigned to another variable, the reference count increases by 1. However, please note that this does not mean the following.
a = [1, 2, 3]
b = [1, 2, 3] # This will NOT increase the reference count
This was discussed in the first section. Although the lists have the same content, they are different objects. To increase the reference count, we can do the following.
a = [1, 2, 3]
b = a
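To check both cases, here is a minimal sketch that records the count before and after each assignment. It compares differences rather than absolute values, because getrefcount() always includes its own temporary reference. The variable names before, after_new_list, after_alias and c are only illustrative.

```python
from sys import getrefcount

a = [1, 2, 3]
before = getrefcount(a)

b = [1, 2, 3]                    # a different list object with equal content
after_new_list = getrefcount(a)

c = a                            # a second name for the same object
after_alias = getrefcount(a)

print(after_new_list - before)   # 0
print(after_alias - before)      # 1
```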

3. The object is passed to a function as an argument.
This is exactly what happens when we call getrefcount(a). The variable a is passed into the function as an argument, so the object is referenced for the duration of the call.
4. The object is added to a container.
A container can be a list, a dictionary or a tuple, as in the following example.
my_list = [a]
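As a sketch of how each container adds one reference, the snippet below places the same object in a list, a dictionary and a tuple and measures the change. The names my_dict and my_tuple are illustrative additions alongside my_list.

```python
from sys import getrefcount

a = [1, 2, 3]
before = getrefcount(a)

my_list = [a]            # the list holds one reference to the object
my_dict = {'key': a}     # the dictionary value holds another
my_tuple = (a,)          # and so does the tuple

after = getrefcount(a)
print(after - before)    # 3
```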

What will reduce the reference count?

Now, let’s have a look at the scenarios that will reduce the reference count.
1. The object goes out of the scope of a function. This usually happens when the function finishes executing.
We can verify this by printing the reference count while the function is executing and again after it has returned. So, we can design the experiment as follows.
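def my_func(var):
    # var is an extra reference to the object while the function runs
    print('Function executing: ', getrefcount(var))

my_func(a)
# After the call returns, the references held by the function are gone
print('Function executed', getrefcount(a))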

But why is the reference count 4 rather than 3 while the function is executing? This involves another Python concept, the "call stack".
When a function is called in Python, a new frame is pushed onto the call stack for its local execution, and every time a function call returns, its frame is popped off the call stack.
This concept will not be expanded in this article because it is out of scope. If you are not familiar with the call stack, what I can tell you is that the traceback with line numbers that you see in an error message comes straight from the call stack.

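Therefore, the reference count was 4 while my_func() was executing: the variable a, the parameter var, the call stack and getrefcount() itself each held a reference. After the function returned, the reference count dropped back to 2. Note that the exact count during the call can differ between Python versions.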
2. When a variable that references the object is deleted.
This is very easy to understand. When we use the del statement to delete the variable, the variable no longer references the object.
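A minimal sketch of this case, assuming a second variable b that references the same list as a (the names before and after are illustrative):

```python
from sys import getrefcount

a = [1, 2, 3]
b = a
before = getrefcount(a)

del b  # removes the name b, not the list itself
after = getrefcount(a)

print(before - after)  # 1
print(a)               # [1, 2, 3]
```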

Please note that if we also delete the variable a in this case, the reference count of the object becomes 0. That is exactly the scenario in which garbage collection releases the object. However, it also means we can no longer use the getrefcount() function to check the reference count, because no name refers to the object any more.
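Although getrefcount() cannot be used at that point, one way to observe the release is a weak reference, which does not increase the reference count. The sketch below uses the standard-library weakref module and a hypothetical Data class, neither of which appears in this article; a plain list cannot be weakly referenced, hence the small class. It assumes CPython, where an object is released as soon as its reference count reaches zero.

```python
import weakref


class Data:
    """Hypothetical placeholder class; its instances support weak references."""


obj = Data()
watcher = weakref.ref(obj)   # a weak reference does not increase the count

alive_before = watcher() is not None

del obj                      # the only strong reference is gone
released = watcher() is None

print(alive_before, released)  # True True (on CPython)
```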
3. When a variable that references the object is assigned to another object.
This case probably happens more often. When a variable is reassigned to another object, the reference count of the original object is reduced by one. Of course, the reference count of the new object is increased by one.
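The reassignment case can be sketched like this. The names first, second and var are illustrative; the snippet measures the change in both objects’ counts when var moves from one to the other.

```python
from sys import getrefcount

first = [1, 2, 3]
second = [4, 5, 6]

var = first
first_before = getrefcount(first)
second_before = getrefcount(second)

var = second   # var stops referencing first and starts referencing second

first_delta = getrefcount(first) - first_before
second_delta = getrefcount(second) - second_before

print(first_delta)   # -1
print(second_delta)  # 1
```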

4. When the object is removed from a container.
When an object is added to a container, its reference count increases by 1. Conversely, when it is removed from the container, its reference count decreases by 1.

Of course, if we delete the container itself, the reference count is also reduced.
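Both situations, removing the object from a container and deleting the container itself, can be sketched together. The variable names are illustrative, and the counts are reported relative to a baseline taken before any container exists.

```python
from sys import getrefcount

a = [1, 2, 3]
base = getrefcount(a)

my_list = [a]
my_dict = {'key': a}
in_containers = getrefcount(a)   # two containers reference the object

my_list.remove(a)                # remove the object from the list
after_remove = getrefcount(a)

del my_dict                      # delete the whole dictionary
after_delete = getrefcount(a)

print(in_containers - base, after_remove - base, after_delete - base)  # 2 1 0
```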

A Special Case
Please note that only ordinary objects can be investigated in this way. There are special cases when the value is a literal constant, such as the number 123 or the string 'abc'.

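The reference count can be unexpected, as shown above. In my case, I’m running the code in Google Colab, where the environment itself already references these values many times, which caused such large reference counts.
Another important factor worth mentioning is that CPython caches small integers and many short strings, so equal literal constants like these usually live at the same memory location. This is an implementation detail rather than a language guarantee.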

Therefore, whenever the number 123 is used anywhere in the interpreter, its reference count may increase. Even though we have only one variable referencing it, the reference count can be much higher than 1.
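A small sketch of this special case, assuming CPython and its cache of small integers. The second value is built at runtime with int('123') so that the shared identity cannot be explained by the two literals sitting in the same piece of source code. No expected count is shown because it depends heavily on the environment and the Python version.

```python
from sys import getrefcount

a = 123
b = int('123')   # created at runtime, yet CPython returns the cached object

same_object = a is b
count = getrefcount(a)

print(same_object)  # True on CPython
print(count)        # varies by environment and Python version
```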
Summary

In this article, I have introduced the garbage collection mechanism in Python, which is based on the reference count of each object.
The following actions will increase the reference count of an object:
- The object is created and assigned to a variable.
- The object is assigned to one more variable.
- The object is passed to a function as an argument.
- The object is added to a container.
Conversely, the following actions will reduce the reference count of an object:
- The object goes out of the scope of a function. This usually happens when the function finishes executing.
- When a variable that references the object is deleted.
- When a variable that references the object is assigned to another object.
- When the object is removed from a container.