
The defaultdict Object in Python

Learn about a better alternative to the Python dictionary object

Photo by David Schultz on Unsplash

Every Python developer knows about dictionary objects in Python. However, there is another dictionary-like object, called a defaultdict object, which offers additional functionality when compared to regular dictionary objects.

Let’s introduce this defaultdict object via an example.


Counting Example

Let’s say that we want to count the number of occurrences of each letter in a word. We will create a dictionary whose keys are the letters and whose values are the number of times each letter occurs.

We will use the longest word in most English dictionaries (according to Google):

word = 'pneumonoultramicroscopicsilicovolcanoconiosis'

We can accomplish this using the dictionary object we’re all familiar with as follows:

letter_count = dict()
for letter in word:
    if letter not in letter_count:
        letter_count[letter] = 1
    else:
        letter_count[letter] += 1
print(letter_count)
Output:
{'p': 2,
 'n': 4,
 'e': 1,
 'u': 2,
 'm': 2,
 'o': 9,
 'l': 3,
 't': 1,
 'r': 2,
 'a': 2,
 'i': 6,
 'c': 6,
 's': 4,
 'v': 1}

We first create an empty dictionary object using the dict() constructor. We then loop over every letter in the word using a for loop. As we loop over every letter in the word, we check if the letter is already a key in the dictionary. If not, we set the key as the letter, and its value to 1 (since that’s the first time that letter has appeared during our loop). If the letter already exists as a key, we add 1 to its value.


Even though a regular dictionary object worked, there are two issues here.

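First, notice that we had to check whether the key already exists in the dictionary. Without that check, incrementing a letter we have not seen yet raises a KeyError:

# Assuming letter_count is still an empty dictionary:
for letter in word:
    letter_count[letter] += 1
# KeyError: 'p'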

Second, if we look up the count of a letter that is not in the dictionary, we also get a KeyError.

letter_count['p']
# 2
letter_count['z']
# KeyError

Sure, we can use the dictionary get() method to solve the first issue. However, the second issue remains:

letter_count_2 = dict()
for letter in word:
    letter_count_2[letter] = letter_count_2.get(letter, 0) + 1
letter_count_2['p']
# 2
letter_count_2['z']
# KeyError

The get method takes in two arguments: the key whose value we want to retrieve, and the value to return if the key does not exist. Note that get only returns this fallback value; it does not add the key to the dictionary.
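To make the behavior of get() concrete, here is a small sketch. The dictionary name counts and the sample word 'banana' are ours, chosen only for illustration. It shows that get() hands back the fallback value without storing it, which is why a square-bracket lookup of a missing letter still fails afterwards.

```python
# Illustrative names: `counts` and the sample word are not from the article's example.
counts = dict()
for letter in 'banana':
    # get() returns 0 when the letter is missing; the assignment stores the new count.
    counts[letter] = counts.get(letter, 0) + 1

print(counts)              # {'b': 1, 'a': 3, 'n': 2}
print(counts.get('z', 0))  # 0
print('z' in counts)       # False: get() did not add 'z'

try:
    counts['z']
except KeyError as error:
    print('KeyError:', error)  # KeyError: 'z'
```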




The defaultdict Object

This is a good example of when a defaultdict object is helpful. The defaultdict class is a subclass of the built-in dict class, so it inherits all of dict's functionality. The difference is that it accepts one additional argument, default_factory, as its first argument. When a key that does not yet exist is looked up, default_factory is called to provide a value for it.

class collections.defaultdict(default_factory=None, /[, ...])
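These claims are easy to check directly. The sketch below uses variable names of our own choosing. It confirms that defaultdict is a subclass of dict, that the factory is stored on the default_factory attribute, and that with the default of None a missing key still raises a KeyError, just as with a regular dictionary.

```python
from collections import defaultdict

print(issubclass(defaultdict, dict))  # True

with_factory = defaultdict(int)
print(with_factory.default_factory)   # <class 'int'>

# Arguments after the factory are handled the same way as by dict().
prefilled = defaultdict(int, a=1, b=2)
print(prefilled['a'])                 # 1

# Without a factory, a missing key behaves as in a regular dictionary.
no_factory = defaultdict()
try:
    no_factory['z']
except KeyError:
    print('KeyError')
```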

Let’s look at how defaultdict can be used to accomplish the above task.


We first have to import defaultdict from the collections module:

from collections import defaultdict

We then use the defaultdict() constructor to create a defaultdict object:

letter_count_3 = defaultdict(int)

Notice how we pass in the int function as an argument for the default_factory parameter. This int function will be called without any arguments to provide a default value for a key if that key does not exist.

for letter in word:
    letter_count_3[letter] += 1

So as we loop over the word, if that key does not exist (meaning we are encountering a letter for the first time), the int function is called without any arguments passed to it, and whatever it returns will be the value for that key. Calling the int function without passing in any arguments returns the integer 0. We then add 1 to the 0 that is returned.
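The two steps described above can be traced by hand. The following sketch (the name demo is ours) first calls int with no arguments, then shows what happens on the first and second increment of the same key.

```python
from collections import defaultdict

print(int())  # 0

demo = defaultdict(int)
# First increment: 'p' is missing, so int() supplies 0, and then 1 is added.
demo['p'] += 1
print(demo['p'])  # 1
# Second increment: 'p' now exists, so the factory is not called again.
demo['p'] += 1
print(demo['p'])  # 2
```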

This solves both of the issues we encountered above, since retrieving the value for any key that does not exist will return zero:

letter_count_3['p']
# 2
letter_count_3['z']
# 0
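One detail worth knowing: looking up a missing key with square brackets does more than return the default, it also stores that default under the key. A short sketch with a fresh defaultdict (the name seen is ours) illustrates this, along with two ways to check for a key without adding it.

```python
from collections import defaultdict

seen = defaultdict(int)
seen['p'] += 2

# Membership tests and get() do not call the factory.
print('z' in seen)    # False
print(seen.get('z'))  # None
print(len(seen))      # 1

# Indexing a missing key calls int() and stores the result.
print(seen['z'])      # 0
print('z' in seen)    # True
print(len(seen))      # 2
```

If the dictionary should only contain letters that were actually counted, prefer `in` or get() for read-only checks.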

We could have also passed in a lambda function as the default_factory argument, one that takes no arguments and returns the value 0:

letter_count_3 = defaultdict(lambda: 0)
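As a quick check that both factories give the same counts, the sketch below counts the letters of the word with each and compares the results. The names by_int and by_lambda are ours.

```python
from collections import defaultdict

word = 'pneumonoultramicroscopicsilicovolcanoconiosis'

by_int = defaultdict(int)
by_lambda = defaultdict(lambda: 0)
for letter in word:
    by_int[letter] += 1
    by_lambda[letter] += 1

# Both factories return 0 for a new key, so the counts match.
print(by_int == by_lambda)  # True
print(by_lambda['o'])       # 9
```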

I hope you enjoyed this article on the defaultdict object in Python. Thank you for reading!

