Right off the bat, I don’t use and will never use WHERE
clause to join tables.
This article is an attempt to convince you to do the same if you are using WHERE
clause. Instead, use ON
clause to achieve table joins.
Do you perform table joins with WHERE
clauses? Please don’t ask me to review or help you find that frustrating snag in your query.
Are you asking me why?
I told you, I’m no fan of the WHERE
clause used to join tables. I employ it only for data filtering. It rubs me off in a bad way (cringing) when used to achieve table joins. My mind goes into overdrive, particularly if used to achieve multiple joins beyond two tables.
An SQL query like below hurts my eyes, and worst my brain. Using WHERE
clause to achieve table joins takes away clarity and any visual cue on the sequential order the tables are joined together.
See the above SQL query joining 6 tables using a WHERE
clause. To make it worse, it’s also performing data filtering using 3 AND
logical operators with the same WHERE
clause.
A rewrite to use ON
clause instead to achieve the same results looks something like so. The reward is very much noticeable right away.
First, the clutter at the tail of the query where we have used WHERE
clause has been tidied up nicely. It’s obvious now what the WHERE
clause is doing – one thing – only row filtering.
Second, straightforward, you can identify and understand the sequence in which the tables are being joined to create the conceptual mega table holding the query results.
The rewrite looks cohesive as well.
More Reasons To Not Use WHERE But ON Clause for Table Joins
1. Data filtering
WHERE
clause is excellent and more appropriate for data filtering, not joining tables.
2. Cognitive overload
Using WHERE
clause in achieving both data filtering and table joins creates cognitive overload. WHERE
clause affects readability and query comprehension when simultaneously used to achieve both table joins and data filtering.
3. Limited
WHERE
clause limits you on the different table joins achievable. WHERE
clause only achieves INNER JOIN
which is one of 5 types of joins you can perform using the ON
clause.
4. Missing records
Because performing table joins with WHERE
clause technically behaves the same as INNER JOIN
, the records in the tables joined must meet the condition expressed in the WHERE
clause to be part of the result of the joined table. Oversight of this fundamental knowledge might cause missing important records in your query results. This oversight can be costive in a financial app like a banking and remittance app.
Using ON Clause To Achieve More Than INNER JOIN
As I pointed out earlier, using WHERE
clause produces the same results as using ON
clause together with the INNER JOIN
clause.
Beyond the ability to achieve INNER JOIN
using the ON
clause, you can also achieve other forms of SQL table joins. The benefits of these other types of table joins are sufficient arguments to leave WHERE
clauses strictly for data filtering. Doing so also introduces consistency to your queries and reduces SQL review and maintenance headaches for other developers, since they don’t have to struggle to distinguish conditions that are being used for table joins and for data filtering.
For the rest of this article, I will show with sample data, the three other forms of Sql joins, besides the inner join.
If you want to follow along hands-on, then first create the tables and seed with sample data.
Create tables, person
and contact
Download the sample data person.csv and contact.csv and seed the tables person
and contact
respectively.
Unzip and finally seed the tables with the sample data using the queries below. Remember to replace /path/to/person.csv
and /path/to/contact.csv
with the path to your downloaded unzipped copies.
If you did everything right, a select statement on the tables should produce the outputs below
Table person
SELECT * FROM person;
Table contact
SELECT * FROM contact;
The Other Types of Joins Besides INNER JOIN
Besides the INNER JOIN
, there are 4 other types of joins, the FULL OUTER JOIN
, LEFT OUTER JOIN
, RIGHT OUTER JOIN
and CROSS JOIN
.
For this article, we talk about the LEFT OUTER JOIN , RIGHT OUTER JOIN
and FULL OUTER JOIN
LEFT OUTER JOIN
A left outer join combines two tables by listing specified columns from a left table and joining them to columns from a right table if both tables satisfy a condition specified with the ON
clause.
A NULL
value is set for the columns from the right table when the condition is not matched to the column from the left table.
Considering the example below, a person
table is in a left outer join with table contact
.
The table person
is the left table and contact
is the right table.
You notice the columns name
, gender
, age
from the left table person
is being joined to the column phone_number
from the right table contact
. Also obvious is, the last 4 rows without a matching person_id
have the phone_number
set to NULL.
SELECT person.name,
person.gender,
person.age,
person.phone_number
FROM person
LEFT OUTER JOIN contact
ON person.id = contact.person_id
Output:
RIGHT OUTER JOIN
A right outer join combines two tables by listing specified columns from a right table and joining them to columns from a left table if both tables satisfy a condition specified with the ON
clause.
A NULL
value is set for the columns from the left table when the condition is not matched to the column from the left table.
Considering the example below, a person
table in a right outer join with table contact
. The table person
is the left table and contact
is the right table.
You notice the phone_number
from all the rows in the right table is listed and joined to name
, gender
, age
of matching records from the left table.
Because the association between person
table and contact
table is relational, precisely a One2many
relation, you can’t have a contact record without a root entity person to own it. As a result, our RIGHT OUTER JOIN
result is limited to persons with contact records.
A listing of name
, gender
, age
from person
table is joined to the phone_number
from the contact
table if they both meet the condition specified with the ON
clause.
SELECT person.name,
contact.mobile_number,
contact.home_address
FROM person
RIGHT OUTER JOIN contact
ON person.id = contact.person_id
Output:
FULL OUTER JOIN
A full outer join combines two tables by listing columns from both tables. A NULL
value is set for the columns, either the left or right table when the condition is not met.
Considering the example below, a person
table and contact
are combined together by a fuller outer join an ON
condition that the id
column on the person
matches the person_id
column on the contact
table. Rows from each table that don’t meet this condition are still shown in the results have their columns set with NULL
SELECT person.name,
person.gender,
person.age,
person.phone_number
FROM person
FULL OUTER JOIN contact
ON person.id = contact.person_id
Output:
Closing Thoughts
Hopefully, I managed to convince you to abandon using WHERE
clause for joining tables. If nothing at all, you’ve noticed how much clarity using ON
clause gives to your query. Remember also the other types of joins you can perform when use the ON
clause.
Hope you found this article helpful.
Knowledge is for the world, and so we share. Wish you the best. See you next time.