
12 Examples to Master Python Sets

A comprehensive practical guide for learning sets

Photo by Heng Films on Unsplash

A set is an unordered collection of distinct hashable objects. This is the definition of a set in the official Python documentation. Let’s open it up.

  • Unordered collection: It contains zero or more elements. There is no order associated with the elements of a set. Thus, it does not support indexing or slicing like we do with lists.
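  • Distinct hashable objects: A set contains unique elements. Hashable means the object has a hash value that never changes during its lifetime. Python's built-in immutable types (e.g. int, str, and tuples of hashable items) are hashable, while mutable containers such as lists and dictionaries are not. Although sets themselves are mutable, the elements of a set must be hashable.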
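Hashability is slightly narrower than immutability, which is easiest to see by trying a few objects. The following is a minimal sketch; the variable names are only illustrative.

```python
# Tuples of hashable items are hashable, so they can be set elements.
points = {(0, 0), (1, 2), (0, 0)}
print(len(points))  # 2, the duplicate tuple is dropped

# A list is mutable and unhashable, so it cannot be a set element.
try:
    bad = {[1, 2], [3, 4]}
    list_error = ""
except TypeError as err:
    list_error = str(err)
print(list_error)  # the message mentions the unhashable type

# A tuple that contains a list cannot be modified as a container,
# but it is still unhashable because one of its items is.
try:
    hash((1, [2, 3]))
    tuple_hashable = True
except TypeError:
    tuple_hashable = False
print(tuple_hashable)  # False
```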

We now have an idea about what a set is in Python. The following examples will clearly explain all the properties of sets and what we can do with them.

Let’s start.

1. Creating a set

A set can be created by putting objects separated by commas inside curly braces.

a = {1, 4, 'foo'}
print(type(a))
<class 'set'>
print(a)
{1, 4, 'foo'}

Sets do not contain repeated elements, so even if we try to add the same element more than once, the resulting set will contain only unique elements.

a = {1, 4, 'foo', 4, 'foo'}
print(a)
{1, 4, 'foo'}

2. Creating an empty set

The set notation is similar to the dictionary notation in Python. The difference is that when creating dictionaries, we put key-value pairs inside curly braces instead of single items.

We need to keep that in mind when creating an empty dictionary. If we only use curly braces with nothing inside, Python thinks it is an empty dictionary. We can use the set function to create an empty set.

a = {}
print(type(a))
<class 'dict'>
b = set()
print(type(b))
<class 'set'>
c = set({})
print(type(c))
<class 'set'>

3. Creating sets with iterables

We can also create sets using other iterables (e.g. list, tuple, string). The iterable is passed to the set function.

lst = [1, 2, 3, 3, 4]
a = set(lst)
print(a)
{1, 2, 3, 4}

The set function returns a set of unique elements. As we can see in the example, the resulting set only contains one of the 3s in the list.

We need to make sure the iterable passed to the set function does not contain unhashable objects (e.g. mutable containers such as lists).

For instance, if we pass a list that contains a mutable object (e.g. list), the set function will raise a TypeError.

lst = [1, 2, 3, [1,4]]
a = set(lst)
print(a)
TypeError: unhashable type: 'list'

4. Creating sets with strings

A string is also an iterable, but I wanted to do a separate example on creating sets from strings.

Iterating over a string yields its characters. Thus, the set function will return a set of the unique characters in the string. Because sets are unordered, the characters will not necessarily appear in the same order as they do in the string.

text = "let's create a set"
a = set(text)
print(a)
{'r', 's', "'", ' ', 'e', 't', 'l', 'c', 'a'}

5. Adding new items in a set

It is pretty simple. We can use the add method to add a new element, which must be hashable.

a = set()
a.add(3)
a.add('foo')
print(a)
{'foo', 3}

6. Removing items from a set

We have different options to remove items from a set. Both "remove" and "discard" methods remove an item from a set.

a = {1,2,3}
a.remove(1)
a.discard(2)
print(a)
{3}

The difference between "remove" and "discard" is observed when we try to remove an item that is not in the set. The remove method raises a KeyError, whereas discard does nothing.

#remove
a = {1,2,3}
a.remove(5)
KeyError: 5
#discard
a = {1,2,3}
a.discard(5)
print(a)
{1, 2, 3}

Another option is to use the "pop" method, which removes an arbitrary item from a set. We cannot choose which item is removed, and the choice is not guaranteed to be random. Unlike "remove" and "discard", the pop method returns the item that has been removed.

a = {1,2,3,5,9,8}
print(a.pop())
print(a)
1
{2, 3, 5, 8, 9}
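Because pop returns the removed item, it can be used to drain a set one element at a time. The sketch below also shows that calling pop on an empty set raises a KeyError; the names removed and pop_failed are illustrative.

```python
a = {1, 2, 3}
removed = []

# Drain the set: pop() returns an arbitrary element each time.
while a:
    removed.append(a.pop())

print(sorted(removed))  # [1, 2, 3]
print(a)                # set()

# Calling pop() on an empty set raises KeyError.
try:
    a.pop()
    pop_failed = False
except KeyError:
    pop_failed = True
print(pop_failed)  # True
```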

Finally, the "clear" method removes all items so that we will have an empty set.

a = {1,2,3,5,9,8}
a.clear()
print(a)
set()

7. Updating a set

The update method can be used to update the items in a set with the items in another iterable.

We can use another set for updating.

a = {1, 2, 3}
b = {1, 6, 7, 5}
a.update(b)
print(a)
{1, 2, 3, 5, 6, 7}

Set a is updated with the items in set b that are not already in set a.

We can pass other iterables, such as a list or a string, to the update method. The update method accepts multiple arguments.

a = {1, 2, 3}
lst = [8, 9]
text = "string"
a.update(lst, text)
print(a)
{1, 2, 3, 't', 'i', 8, 9, 's', 'r', 'g', 'n'}

We can also use the "|=" operator to update a set, but the right-hand operand must be a set (or frozenset), not just any iterable.

a = {1, 2, 3}
b = {5, 6}
a |= b
print(a)
{1, 2, 3, 5, 6}
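To make the difference between update and "|=" concrete, here is a small sketch that tries both with a list. The flag in_place_ok is just an illustrative name.

```python
a = {1, 2, 3}

# update() accepts any iterable, including a list.
a.update([5, 6])
print(sorted(a))  # [1, 2, 3, 5, 6]

# The |= operator rejects a list and leaves the set unchanged.
try:
    a |= [7, 8]
    in_place_ok = True
except TypeError:
    in_place_ok = False
print(in_place_ok)  # False

# Converting the list to a set first makes the operator work.
a |= set([7, 8])
print(sorted(a))  # [1, 2, 3, 5, 6, 7, 8]
```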

8. Comparison between sets

We can compare sets in terms of their difference and intersection. We can also combine sets by taking their union.

The following figure illustrates what these operations imply.

Set operations (image by author)

Let’s do a few examples that support the figure above.

A = {1, 2, 3}
B = {1, 4, 5}
print(A.difference(B))
{2, 3}
print(B.difference(A))
{4, 5}
print(A.intersection(B))
{1}
print(A.union(B))
{1, 2, 3, 4, 5}

Another useful method is "symmetric_difference", which combines A.difference(B) and B.difference(A). Thus, it returns the items that are in exactly one of the two sets.

print(A.symmetric_difference(B))
{2, 3, 4, 5}
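Each of these methods also has an operator form. The sketch below checks that the two forms agree and that the symmetric difference equals the union of the two differences. One difference worth noting is that the methods accept any iterable, while the operators need sets on both sides.

```python
A = {1, 2, 3}
B = {1, 4, 5}

# Operator forms of the comparison methods.
print((A - B) == A.difference(B))            # True
print((A & B) == A.intersection(B))          # True
print((A | B) == A.union(B))                 # True
print((A ^ B) == A.symmetric_difference(B))  # True

# The symmetric difference is the union of both one-sided differences.
print((A ^ B) == ((A - B) | (B - A)))        # True

# Methods accept any iterable, not only sets.
with_list = A.union([9])
print(sorted(with_list))                     # [1, 2, 3, 9]
```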

9. Updating based on comparison

The operations we did in the previous step return a new set based on the specified operation (e.g. union, intersection). However, they do not change the original sets. For instance, sets A and B remained the same after the operations.

If we want to update the sets based on these comparisons and operations, we combine these methods with update as follows.

A = {1, 2, 3}
B = {1, 4, 5}
A.intersection_update(B)
print(A)
{1}
print(B)
{1, 4, 5}

The intersection_update method does not return a set but updates set A in place. The set B does not change.

Note: "A &= B" is equivalent to A.intersection_update(B).

The same syntax is valid for other comparisons as well. Let’s do another example.

A = {1, 2, 3}
B = {1, 4, 5}
B.difference_update(A)
print(B)
{4, 5}

Note: "A -= B" is equivalent to A.difference_update(B).

Note: "A ^= B" is equivalent to A.symmetric_difference_update(B).
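A short sketch of what "in place" means here: the update methods return None and modify the existing object, so any other name bound to the same set sees the change. The name alias is only illustrative.

```python
A = {1, 2, 3}
B = {1, 4, 5}
alias = A  # a second name for the same set object

result = A.intersection_update(B)
print(result)  # None, because the method works in place
print(alias)   # {1}, the alias sees the change

A = {1, 2, 3}
A ^= B         # operator form of the symmetric difference update
print(sorted(A))  # [2, 3, 4, 5]
```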


10. Frozenset

Frozensets are just like sets with one difference: they are immutable. We cannot perform any operation that changes a frozenset, such as add, remove, update and so on.

A = {1, 2, 3}
B = frozenset(A)
B.add(5)
AttributeError: 'frozenset' object has no attribute 'add'
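Being immutable also makes frozensets hashable, so they can be used where ordinary sets cannot, for example as elements of another set or as dictionary keys. The following sketch illustrates this; the names nested, labels and combined are made up for the example.

```python
# Frozensets can be elements of a set.
nested = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in nested)  # True

# Frozensets can be dictionary keys.
labels = {frozenset({"a", "b"}): "pair"}
print(labels[frozenset({"b", "a"})])  # pair

# Operations that do not modify the frozenset still work
# and return a new object.
combined = frozenset({1, 2}) | {3}
print(type(combined).__name__)  # frozenset
print(sorted(combined))         # [1, 2, 3]
```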

11. Superset and subset

  • A is a superset of B if all elements in B are also in A.
  • B is a subset of A if all elements in B are also in A.

Thus, if A is a superset of B, then B is a subset of A. The issubset and issuperset methods are used to do these comparisons.

A = {1, 2, 3, 4}
B = {1, 2}
C = {1, 2, 8}
print(A.issuperset(B))
True
print(B.issubset(A))
True
print(A.issuperset(C))
False
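The same checks can be written with comparison operators. A brief sketch, which also shows the strict versions ("<" and ">") that require the two sets to be different:

```python
A = {1, 2, 3, 4}
B = {1, 2}

print(B <= A)  # True, same as B.issubset(A)
print(A >= B)  # True, same as A.issuperset(B)

# Every set is a subset of itself, but not a proper subset.
print(A <= A)  # True
print(A < A)   # False
print(B < A)   # True
```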

12. Disjoint sets

Two sets are said to be disjoint if they do not have any common elements. The method to use is "isdisjoint".

A = {1, 2, 3, 4}
B = {1, 2}
C = {10, 11}
print(A.isdisjoint(B))
False
print(A.isdisjoint(C))
True
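Put differently, two sets are disjoint exactly when their intersection is empty. The sketch below checks this and also shows that isdisjoint accepts any iterable as its argument.

```python
A = {1, 2, 3, 4}
C = {10, 11}

# Disjoint means the intersection is empty.
print(A.isdisjoint(C))   # True
print(len(A & C) == 0)   # True

# The argument can be any iterable, such as a list or a range.
print(A.isdisjoint([10, 11]))     # True
print(A.isdisjoint(range(4, 8)))  # False, because 4 is shared
```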

Bonus

Since sets are mutable, we need to be careful when copying them. If we copy a set using the "=" operator, we create a new reference to the existing set instead of creating a new one. We can use the "copy" method or the "set" constructor to create a new object.

Let’s do an example.

A = {1, 2, 3}
B = A
C = A.copy()
D = set(A)
print(B is A)
True
print(C is A)
False
print(D is A)
False

As you can see, both A and B refer to the same object in memory. Thus, any in-place change to A will also be visible through B.

We can also check the identities of these objects.

print(id(A))
print(id(B))
print(id(C))
140229247448008
140229247448008
140229272419912

A and B have the same ID and C has a different one.
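To see the practical consequence, we can modify A and then inspect the other names. A minimal sketch:

```python
A = {1, 2, 3}
B = A          # another name for the same object
C = A.copy()   # a new, independent set

A.add(4)

print(sorted(B))  # [1, 2, 3, 4], B sees the change
print(sorted(C))  # [1, 2, 3], the copy is unaffected
```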


The set is an important data structure in Python. A common use case of sets is to remove duplicate elements from a sequence.

They can also be used to perform common math operations such as union, intersection and so on.
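As a sketch of the duplicate-removal use case: converting a list to a set drops the duplicates but does not keep the original order. When the order matters, one commonly used alternative (not a set operation) is dict.fromkeys, which relies on dictionaries preserving insertion order in Python 3.7 and later. The variable names are illustrative.

```python
items = [3, 1, 3, 2, 1]

# A set removes duplicates, but the order is not preserved.
unique_items = set(items)
print(len(unique_items))  # 3

# dict.fromkeys keeps the first occurrence of each item, in order.
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)  # [3, 1, 2]
```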

Thank you for reading. Please let me know if you have any feedback.

