The world’s leading publication for data science, AI, and ML professionals.

How Do You Join SQL Tables?

WHERE or ON clause?

Right off the bat, I don’t use and will never use WHERE clause to join tables.

This article is an attempt to convince you to do the same if you are using WHERE clause. Instead, use ON clause to achieve table joins.

Do you perform table joins with WHERE clauses? Please don’t ask me to review or help you find that frustrating snag in your query.

Are you asking me why?

I told you, I’m no fan of the WHERE clause used to join tables. I employ it only for data filtering. It rubs me off in a bad way (cringing) when used to achieve table joins. My mind goes into overdrive, particularly if used to achieve multiple joins beyond two tables.

An SQL query like below hurts my eyes, and worst my brain. Using WHERE clause to achieve table joins takes away clarity and any visual cue on the sequential order the tables are joined together.

See the above SQL query joining 6 tables using a WHERE clause. To make it worse, it’s also performing data filtering using 3 AND logical operators with the same WHERE clause.

A rewrite to use ON clause instead to achieve the same results looks something like so. The reward is very much noticeable right away.

First, the clutter at the tail of the query where we have used WHERE clause has been tidied up nicely. It’s obvious now what the WHERE clause is doing – one thing – only row filtering.

Second, straightforward, you can identify and understand the sequence in which the tables are being joined to create the conceptual mega table holding the query results.

The rewrite looks cohesive as well.


More Reasons To Not Use WHERE But ON Clause for Table Joins

1. Data filtering

WHERE clause is excellent and more appropriate for data filtering, not joining tables.

2. Cognitive overload

Using WHERE clause in achieving both data filtering and table joins creates cognitive overload. WHERE clause affects readability and query comprehension when simultaneously used to achieve both table joins and data filtering.

3. Limited

WHERE clause limits you on the different table joins achievable. WHERE clause only achieves INNER JOIN which is one of 5 types of joins you can perform using the ON clause.

4. Missing records

Because performing table joins with WHERE clause technically behaves the same as INNER JOIN, the records in the tables joined must meet the condition expressed in the WHERE clause to be part of the result of the joined table. Oversight of this fundamental knowledge might cause missing important records in your query results. This oversight can be costive in a financial app like a banking and remittance app.


Using ON Clause To Achieve More Than INNER JOIN

As I pointed out earlier, using WHERE clause produces the same results as using ON clause together with the INNER JOIN clause.

Beyond the ability to achieve INNER JOIN using the ON clause, you can also achieve other forms of SQL table joins. The benefits of these other types of table joins are sufficient arguments to leave WHERE clauses strictly for data filtering. Doing so also introduces consistency to your queries and reduces SQL review and maintenance headaches for other developers, since they don’t have to struggle to distinguish conditions that are being used for table joins and for data filtering.


For the rest of this article, I will show with sample data, the three other forms of Sql joins, besides the inner join.

If you want to follow along hands-on, then first create the tables and seed with sample data.

Create tables, person and contact

Download the sample data person.csv and contact.csv and seed the tables person and contactrespectively.

Unzip and finally seed the tables with the sample data using the queries below. Remember to replace /path/to/person.csv and /path/to/contact.csv with the path to your downloaded unzipped copies.

If you did everything right, a select statement on the tables should produce the outputs below

Table person

SELECT * FROM person;

Table contact

SELECT * FROM contact;

The Other Types of Joins Besides INNER JOIN

Besides the INNER JOIN, there are 4 other types of joins, the FULL OUTER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN and CROSS JOIN.

For this article, we talk about the LEFT OUTER JOIN , RIGHT OUTER JOIN and FULL OUTER JOIN

LEFT OUTER JOIN

A left outer join combines two tables by listing specified columns from a left table and joining them to columns from a right table if both tables satisfy a condition specified with the ON clause.

A NULL value is set for the columns from the right table when the condition is not matched to the column from the left table.

Considering the example below, a person table is in a left outer join with table contact .

The table person is the left table and contact is the right table.

You notice the columns name , gender , age from the left table person is being joined to the column phone_number from the right table contact . Also obvious is, the last 4 rows without a matching person_id have the phone_number set to NULL.

SELECT person.name,
       person.gender,
       person.age, 
       person.phone_number
FROM person
LEFT OUTER JOIN contact
ON person.id = contact.person_id

Output:


RIGHT OUTER JOIN

A right outer join combines two tables by listing specified columns from a right table and joining them to columns from a left table if both tables satisfy a condition specified with the ON clause.

A NULL value is set for the columns from the left table when the condition is not matched to the column from the left table.

Considering the example below, a person table in a right outer join with table contact . The table person is the left table and contact is the right table.

You notice the phone_number from all the rows in the right table is listed and joined to name , gender , age of matching records from the left table.

Because the association between person table and contact table is relational, precisely a One2many relation, you can’t have a contact record without a root entity person to own it. As a result, our RIGHT OUTER JOIN result is limited to persons with contact records.

A listing of name , gender , age from person table is joined to the phone_number from the contact table if they both meet the condition specified with the ON clause.

SELECT person.name, 
       contact.mobile_number, 
       contact.home_address
FROM person
RIGHT OUTER JOIN contact
ON person.id = contact.person_id

Output:

FULL OUTER JOIN

A full outer join combines two tables by listing columns from both tables. A NULL value is set for the columns, either the left or right table when the condition is not met.

Considering the example below, a person table and contact are combined together by a fuller outer join an ONcondition that the id column on the personmatches the person_id column on the contact table. Rows from each table that don’t meet this condition are still shown in the results have their columns set with NULL

SELECT person.name,
       person.gender,
       person.age, 
       person.phone_number
FROM person
FULL OUTER JOIN contact
ON person.id = contact.person_id

Output:


Closing Thoughts

Hopefully, I managed to convince you to abandon using WHERE clause for joining tables. If nothing at all, you’ve noticed how much clarity using ON clause gives to your query. Remember also the other types of joins you can perform when use the ON clause.

Hope you found this article helpful.

Knowledge is for the world, and so we share. Wish you the best. See you next time.


Related Articles