Most Python developers are familiar with dictionary objects. However, there is another dictionary-like object, called a defaultdict object, which offers additional functionality compared to regular dictionary objects.
Let’s introduce this defaultdict object via an example.
counting example
Let’s say that we want to count the number of occurrences of each letter in a word. We will create a dictionary with the keys as the letters and their values as their number of occurrences.
We will use the longest word in most English dictionaries (according to Google):
word = 'pneumonoultramicroscopicsilicovolcanoconiosis'
We can accomplish this using the dictionary object we’re all familiar with as follows:
letter_count = dict()
for letter in word:
    if letter not in letter_count:
        letter_count[letter] = 1
    else:
        letter_count[letter] += 1
print(letter_count)
Output:
{'p': 2,
'n': 4,
'e': 1,
'u': 2,
'm': 2,
'o': 9,
'l': 3,
't': 1,
'r': 2,
'a': 2,
'i': 6,
'c': 6,
's': 4,
'v': 1}
We first create an empty dictionary object using the dict() constructor. We then loop over every letter in the word using a for loop. As we loop over every letter in the word, we check if the letter is already a key in the dictionary. If not, we set the key as the letter, and its value to 1 (since that’s the first time that letter has appeared during our loop). If the letter already exists as a key, we add 1 to its value.
Even though a regular dictionary object worked, there are two issues here.
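First, notice how we had to check whether the key already exists in the dictionary. Without that check, incrementing a key that is not yet in the dictionary (for example, when starting from an empty dictionary) raises a KeyError:
for letter in word:
    letter_count[letter] += 1
# KeyError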
Second, if we try to check for the count of letters not present in the dictionary, we would also get a KeyError.
letter_count['p']
# 2
letter_count['z']
# KeyError
Sure, we can use the dictionary get() method to solve the first issue. However, the second issue would still remain:
letter_count_2 = dict()
for letter in word:
    letter_count_2[letter] = letter_count_2.get(letter, 0) + 1
letter_count_2['p']
# 2
letter_count_2['z']
# KeyError
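The get method takes two arguments: the key whose value we want to retrieve, and the value to return if that key does not exist. Note that get only returns this default value; it does not add the key to the dictionary, which is why the lookup of 'z' above still fails.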
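To make the behavior of get() concrete, here is a minimal sketch using a small hand-written dictionary (the contents are only for illustration). It shows that the default is returned but never stored:

```python
# A tiny dictionary used only for illustration.
letter_count_2 = {'p': 2}

found = letter_count_2.get('p', 0)    # key exists, so its stored value is returned
missing = letter_count_2.get('z', 0)  # key is absent, so the default 0 is returned

print(found)                  # 2
print(missing)                # 0
print('z' in letter_count_2)  # False: get() did not add 'z' to the dictionary
```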
defaultdict object
This is a good example of a situation where a defaultdict object is helpful. The defaultdict class is a subclass of the built-in dict class, so it inherits all of the functionality of dict. In addition, it accepts one extra argument (as the first argument). This additional argument, default_factory, will be called to provide a value for a key that does not yet exist.
class collections.defaultdict(default_factory=None, /[, …])
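The relationship to dict can be checked directly. The short sketch below uses illustrative variable names; it confirms the subclass relationship and shows that, when default_factory is left at its default of None, a missing key still raises a KeyError just as it does for a regular dictionary:

```python
from collections import defaultdict

# defaultdict inherits from the built-in dict class.
is_subclass = issubclass(defaultdict, dict)
print(is_subclass)  # True

# The factory passed as the first argument is stored on the object.
counter = defaultdict(int)
print(counter.default_factory is int)  # True

# With no factory (default_factory=None), missing keys raise KeyError.
no_factory = defaultdict()
try:
    no_factory['missing']
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```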
Let’s look at how defaultdict can be used to accomplish the above task.
We first have to import defaultdict from the collections module:
from collections import defaultdict
We then use the defaultdict() constructor to create a defaultdict object:
letter_count_3 = defaultdict(int)
Notice how we pass in the int function as an argument for the default_factory parameter. This int function will be called without any arguments to provide a default value for a key if that key does not exist.
for letter in word:
    letter_count_3[letter] += 1
So as we loop over the word, if that key does not exist (meaning we are encountering a letter for the first time), the int function is called without any arguments passed to it, and whatever it returns will be the value for that key. Calling the int function without passing in any arguments returns the integer 0. We then add 1 to the 0 that is returned.
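The two steps described above can be traced with a single key. This is a small sketch with an illustrative dictionary named counts:

```python
from collections import defaultdict

# Calling int with no arguments returns the integer 0.
zero = int()
print(zero)  # 0

counts = defaultdict(int)
counts['a'] += 1  # 'a' is missing: int() supplies 0, then 1 is added
counts['a'] += 1  # 'a' now exists: the stored value 1 is incremented
print(counts['a'])  # 2
```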
This solves both of the issues we encountered above, since retrieving the value for any key that does not exist will return zero:
letter_count_3['p']
# 2
letter_count_3['z']
# 0
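One detail worth knowing when relying on this behavior: looking up a missing key with square brackets does not just return the default, it also stores that key in the dictionary. The sketch below repeats the counting loop so that it runs on its own, and the names before and after are only for illustration:

```python
from collections import defaultdict

word = 'pneumonoultramicroscopicsilicovolcanoconiosis'

letter_count_3 = defaultdict(int)
for letter in word:
    letter_count_3[letter] += 1

before = 'z' in letter_count_3  # 'z' never occurs in the word
value = letter_count_3['z']     # the lookup calls int() and stores the result
after = 'z' in letter_count_3   # the key now exists with value 0

print(before)  # False
print(value)   # 0
print(after)   # True
```

If the dictionary should stay unchanged, a membership test with in or a call to get() can be used for the lookup instead, since neither of them triggers the default factory.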
We could have also passed in a lambda function as the default_factory argument that takes in no arguments and returns the value 0 as follows:
letter_count_3 = defaultdict(lambda: 0)
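As a quick check that the two factories are interchangeable for this task, the following sketch counts the same word with both and compares the results (with_int and with_lambda are illustrative names):

```python
from collections import defaultdict

word = 'pneumonoultramicroscopicsilicovolcanoconiosis'

with_int = defaultdict(int)
with_lambda = defaultdict(lambda: 0)

for letter in word:
    with_int[letter] += 1
    with_lambda[letter] += 1

# Equality compares the stored keys and values, not the factories.
same = with_int == with_lambda
print(same)  # True
```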
I hope you enjoyed this article on the defaultdict object in Python. Thank you for reading!