Data Science

Too often on social media I see images similar to the one shown above. They include a short message stating these are the "commonly used Sql joins". The issue is that those posts typically omit anti-joins. I believe anti-joins are important to learn early in your career so you are aware of them for when you will inevitably need them. In this post I want to show you how to use anti-joins effectively to quickly get the data insights you need.
I will be using this SQLite sandbox database for my code examples: https://www.sql-practice.com/.
What is an Anti-Join?

You can see a visual representation of an anti-join in the image above. An anti-join is when you would like to keep all of the records in the original table except those records that match the other table. I have never used a right-anti join in my work, the left anti-join will typically work for your use cases. As a matter of fact, I don’t usually use right joins at all since you can accomplish the same thing as a left join just by switching the order of the tables.
Anti-Join Example Using Code
This is an example of a left anti-join. It is the exact same as a left join except for the WHERE
clause. This is what differentiates it from a typical left join.
The query above is finding all admissions that did not have a matching physician_id in the physicians table. By setting the physician_id column in the example above to null, it is finding all rows in the left table that did not have a matching record (a null value) in the table on the right.
In this particular case, this returns 0 rows since the tables have the same physicians.
When Should You Use an Anti-Join?
Describing the above query example should make the use case clear. In the above code, you can see the first CTE (common table expression) filters the physicians table to just general surgeons.
This table is then joined (using an anti-join) to the admissions table to find all admissions that didn’t involve general surgeon physicians. From this example, you can see this type of join is useful when you want to find all those records that don’t match between the two joining tables.
To see these non-matches for yourself, we can alter the above query to be just a left join and show all columns. You’ll see non-matches will have all null values for the columns for the table on the right.
Quick note: you can reference any column in the WHERE
clause, it doesn’t have to be the same as the columns you use to make the join.
For example, you could swap out the physician_id for the first_name column and get the same result for the query shown in antijoin_example2.sql.
That being said, it’s best practice to use the same column as the column in the join – I just thought I would point this out to help you understand what the join is doing.
When You Should Just Use the EXCEPT Operator Instead
If you use Oracle SQL this section still applies, just replace all references to the EXCEPT
operator with the MINUS
operator instead.
I can hear some people already asking, "But wait, the EXCEPT
operator already does this exact thing. Why shouldn’t I just use that operator instead?"
If you’re not sure what the EXCEPT
operator is, basically it takes a table and finds all records in the first table that aren’t in the second table. This is exactly the same purpose as an anti-join, but they are used in different scenarios:
- Use the
EXCEPT
when you only need the columns you are comparing between the two tables - Use the anti-join when you need more columns than what you would compare when using the
EXCEPT
operator
If we used the EXCEPT
operator in this example, we would have to join the table back to itself just to get the same number of columns as the original admissions table.
As you see, this just leads to an extra step with code that is harder to understand and less efficient. You almost always want to avoid joining a table back to itself as well.
That being said, the EXCEPT
operator is still useful, just not in those situations where you need to get additional columns beyond the columns you are comparing when using the EXCEPT
operator.
UPDATE: There is a third way to do an anti-join that I cover in the article below. If you already understand how to use the EXISTS
sub-clause, skip to the ‘How to use the EXISTS
Clause for an Anti-Join’ section in the article.
Conclusion
Now you know how and when to use an anti-join. It’s as straightforward as adding a WHERE
clause to your left or right joins 😎
Thanks for reading!
If you enjoyed this read, please connect with me on LinkedIn or check out another story from me:
10 Quick SQL Tips After Writing Daily in SQL for 3 Years
5 Window Function Examples to Take Your SQL Skills to the Next Level
References
- sql-practice.com (2022), SQL Practice.com
- PostgreSQL EXCEPT Operator (2022), PostgreSQL Tutorial