
4 Must-Know Methods for Python Set Comparison

A practical guide

(image by author)

The set is one of the 4 built-in data structures in Python. The others are the dictionary, list, and tuple.

According to the official documentation, a set is an unordered collection of distinct hashable objects. Thus, two characteristic features of sets are:

  • They do not contain duplicate elements
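  • The elements must be hashable. Immutable built-in types such as integers, strings, and tuples of hashable items qualify, whereas lists and dictionaries do not. A set itself is mutable, so a set cannot be an element of another set.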
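The two features above can be checked with a short sketch. The variable names here are only illustrative, and the exact wording of the error message may vary between Python versions:

```python
# Duplicates are dropped when the set is built.
numbers = {1, 2, 2, 3, 3, 3}
print(numbers == {1, 2, 3})  # True

# Hashable elements such as tuples are accepted.
points = {(0, 0), (1, 1)}

# Unhashable elements such as lists raise a TypeError.
try:
    invalid = {[1, 2], [3, 4]}
except TypeError as error:
    message = str(error)
    print(message)
```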

In this article, we will go over 4 must-know methods for comparing sets. Let’s start with creating a set object.

myset = {1, 2, 3, 4, 5}
type(myset)
# Output
set

The methods we will cover are:

  • Intersection
  • Difference
  • Union
  • Symmetric difference

Difference and symmetric difference

The difference method finds the elements that exist in one set but not in the others.

(image by author)

As shown in the drawing above, the difference of set A from set B includes all the elements that exist in set A but do not exist in set B.

Let’s do some examples to see how these methods are used.

A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

A.difference(B)
# Output
{1, 2}

B.difference(A)
# Output
{6, 7}

Set A contains 1 and 2 whereas set B doesn’t. Thus, A difference B returns 1 and 2.
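For two sets, the difference can also be written with the - operator. The following is a minimal sketch using the same A and B; the names a_minus_b, b_minus_a, and from_list are only for illustration:

```python
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

# The - operator is the operator form of the difference method.
a_minus_b = A - B
b_minus_a = B - A
print(a_minus_b == A.difference(B))  # True
print(b_minus_a == B.difference(A))  # True

# The method accepts any iterable, whereas the operator
# requires a set on both sides.
from_list = A.difference([3, 4, 5])
print(from_list == a_minus_b)  # True
```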

The difference method can be used on more than two sets. For instance, "A.difference(B, C)" finds the elements that exist in A but not in B or C. Let’s see it in action:

A = {1, 2, 3, 4}
B = {3, 4, 10, 11}
C = {2, 4, 20, 21}
A.difference(B, C)
# Output
{1}

The symmetric difference method finds the elements that exist in only set A or only set B. Thus, it returns the union of "A difference B" and "B difference A". Here is how this method is used with the sets A and B from the first example:

A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

A.symmetric_difference(B)
# Output
{1, 2, 6, 7}

Since the symmetric difference covers both "A difference B" and "B difference A", switching the positions of A and B gives the same result:

(image by author)
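The same property can be verified in code. This is a small sketch with the sets from the first example; the intermediate names are only for illustration:

```python
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

# Switching the positions of A and B does not change the result.
a_sym_b = A.symmetric_difference(B)
b_sym_a = B.symmetric_difference(A)
print(a_sym_b == b_sym_a)  # True

# The result equals the union of the two one-sided differences.
union_of_differences = A.difference(B).union(B.difference(A))
print(a_sym_b == union_of_differences)  # True

# The ^ operator is the operator form for two sets.
print((A ^ B) == a_sym_b)  # True
```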

Intersection and union

The intersection of two sets contains the elements that exist in both sets. The union contains the elements that exist in at least one of the sets.

(image by author)
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

A.intersection(B)
# Output
{3, 4, 5}

Note: "A.intersection(B)" is the same as "B.intersection(A)".

Let’s find the union of A and B as well:

A.union(B)
# Output
{1, 2, 3, 4, 5, 6, 7}

The union method returns a new set that contains every element of both sets. Although 3, 4, and 5 exist in both sets, each of them appears only once in the result because a set cannot contain duplicates.
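As a quick check, the sketch below compares the sizes before and after the union and confirms that the original sets are left unchanged. The name combined is only for illustration:

```python
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}

combined = A.union(B)

# 10 elements go in, but the 3 shared ones are counted once.
print(len(A) + len(B))  # 10
print(len(combined))    # 7

# union returns a new set; A and B are not modified.
print(A == {1, 2, 3, 4, 5})  # True

# The | operator is the operator form for two sets.
print((A | B) == combined)  # True
```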

The union and intersection methods can be used with more than two sets.

A = {1, 2, 3, 4}
B = {3, 4, 10, 11}
C = {2, 4, 20, 21}

A.union(B, C)
# Output
{1, 2, 3, 4, 10, 11, 20, 21}

A.intersection(B, C)
# Output
{4}
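When the sets are stored in a list, the same calls can be made by unpacking the list. This is a minimal sketch; the names sets, all_elements, and common_elements are only illustrative, and the intersection call assumes the list contains at least one set:

```python
sets = [
    {1, 2, 3, 4},
    {3, 4, 10, 11},
    {2, 4, 20, 21},
]

# Start from an empty set and take the union with every set in the list.
all_elements = set().union(*sets)

# Call intersection on the first set, passing the rest as arguments.
common_elements = set.intersection(*sets)

print(all_elements == {1, 2, 3, 4, 10, 11, 20, 21})  # True
print(common_elements == {4})  # True
```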

We have covered 4 different methods that are used for comparing two or more sets. If you’d like to learn more about sets, here is a more detailed article I previously wrote:

12 Examples to Master Python Sets



Thank you for reading. Please let me know if you have any feedback.

