
Too often, beginners choose the familiar over the optimal.
Most Python courses introduce students to container data types by starting with Lists. Naturally, beginners become comfortable with them. When starting out, you focus on moving fast and tackling the next topic in order to learn as much as possible while the motivation lasts. As a result, you develop habits that work, but are not always best practice. This is especially true of lists due to their versatility.
Most people use lists when a set would have sufficed. It’s much rarer to see a set being used when a list is required. In those cases, the missing information typically produces an error. However, since lists contain more information than sets, using lists in lieu of sets usually doesn’t produce errors, making it hard to identify when a set should be used.
Here are 3 clues to help spot when sets make more sense than lists.
1. Order doesn’t matter
Unlike lists, sets do not store ordered data. Lists have indexed and accessible data, meaning each element is retrievable. There is no way to access an individual item in a set because they are not given an index.
However, not all data needs to be indexed. For example, membership testing. Testing whether or not an item belongs to a group does not require the group to be indexed because it’s only returning a True
or False
statement.
2. Duplicates are irrelevant
By definition, sets cannot contain duplicates. This makes sets the ideal container for storing unique items.
In fact, this attribute makes the set data type a popular method of removing duplicates from a container. For example, the code belows creates a set, countries
, to find all the countries that received Olympic medals for the men’s 100-meter sprint. Even though some countries appear in the dataset multiple times, countries
removes those repeated values and only returns the unique items.

To further refine the set, use set comprehension. Similar to list comprehension, set comprehension creates a set by reducing a for loop to one line of code. Set comprehensions allow for additional criteria. For instance, the set comprehension below, startswith
, finds all the countries that medaled and start with 'B'
.
3. Processing speed matters
Since sets do not store indexed data or duplicates, they use less memory than lists and are less computationally expensive. As a result, sets take less time to search through.
To test this, the code below performs two membership tests. In one case, the container is a list and in the other, it’s a set. Both containers are the same length, and both cases store the outputs in a set.
Storing the outputs of the list case in a list would take more time and introduce another variable, output type, into the experiment. The container type used to create the set comprehension is the only difference between these two cases.
The chart below illustrates the time difference between sets and lists in membership testing as container size increases.

Conclusion
As you can see, in many cases, sets offer advantages over lists. Hopefully, this article helped you identify when to use sets in order to avoid becoming over-dependent on lists.
Thank you for reading my article. If you enjoy my content, please consider following me. Also, all feedback is welcome. I am always eager to learn new or better ways of doing things. Feel free to leave a comment or reach out to me at [email protected].