
SQL has been around for decades, and is a go-to language for data analysis and lookups. With the rise of data-related Programming languages such as R and Python, it’s easy to use SQL only for a simple SELECT *
statement and perform the filterings and aggregations later. While tempting, that’s not the best solution.
Today you’ll learn the basics of SQL through a ton of hands-on examples. You’ll need to have a PostgreSQL database installed to follow along.
The article is structured as follows:
- Dataset introduction
- Select data
- Filter data
- Sort data
- Match data
- Join and group data
- Conclusion
Dataset introduction
As briefly mentioned earlier, you’ll need to have the PostgreSQL database installed. You’ll also need the Dvd Rental dataset that you can load to your database via the restore functionality.
The ER diagram of the Dvd rental dataset is shown below:

You’ll use customer and payment tables throughout the article, but feel free to explore the others on your own.
Select data
The most basic operation you’ll do with SQL is selecting the data. It’s done with the SELECT
keyword (not case-sensitive). If you want to grab all columns from a particular table, you can use the SELECT * FROM <table_name>
syntax. Likewise, if you want only specific columns, you can replace the star sign with column names.
Let’s take a look at a couple of examples to grasp a full picture.
Here’s how you can grab all data from the customer table:
The results are shown in the image below:

But what if you only want the data on customer ID, first, and last name? Here’s what you can do:
And here are the results:

It’s always a good practice to specify column names instead of using the star syntax. The table structure might change in the future, so you might end up with more columns than anticipated. Even if that’s not the case, what’s the point of selecting columns you won’t work with?
Filter data
It’s highly likely you won’t need all records from a table. That’s where filtering comes into play. You can filter the result set with the WHERE
keyword. Any condition that has to be met goes after it.
Here’s how you can grab only inactive customers:
The results are shown below:

But what if you want to filter by multiple conditions? You can’t use the WHERE
keyword again. Instead, you can list conditions separated by the AND
keyword.
Here’s how to select all inactive customers from the first store (store_id is 1):
Here are the results:

You can put as many filter conditions after this one; just make sure to separate them with the AND
keyword.
Sort data
Sorting is an essential part of every analysis. Maybe you want to sort users by their registration date, products by the expiration date, or movies by rating – the ORDER BY
keyword has you covered.
Let’s see how you can sort the customers by their respective ID:
The results are shown below:

As you can see, sorting works in ascending order by default. Sometimes you want items ordered from highest to lowest (descending), so you’ll need one additional keyword – DESC
.
Here’s how to do the same sorting but in descending order:
The results are shown in the image below:

And that’s all there is to data sorting.
Match data
Sometimes you don’t know what exactly are you looking for, but you have a rough idea. For example, maybe you know that the name of the customer of interest starts with some letter (or a sequence of letters), but you’re not quite sure.
That’s where matching comes in. In SQL, matching is implemented with the LIKE
keywords. There are multiple ways to do matching, but we’ll cover only the basics.
For example, let’s say you want to see only these customers whose first name starts with "An":
The results are shown below:

You can do matching on different parts of the variable. For example, let’s say you want only those customers whose first name ends with "ne":
Here are the results:

There are more advanced matching operations, like specifying the number of characters before and after, but that’s beyond the scope for today.
Join and group data
It’s improbable that all of the data you need is stored in a single table. More often than not, you’ll have to use joins to combine results from two or more tables. Luckily, that’s easy to do with SQL.
There are many types of joins:
INNER JOIN
– returns the rows that have matching values in both tablesLEFT JOIN
– returns all rows from the left table and only the matched rows from the right tableRIGHT JOIN
– returns all rows from the right table and only the matched rows from the left tableFULL JOIN
– returns all rows when there’s a match in either table
Here’s how you can use joins to combine customer and payment tables and extract the amount paid per transaction:
The results are shown below:

As you can see, there are multiple records for every customer. That’s because a single record represents a single transaction, and a single customer can make multiple transactions.
If you want to find the sum of the amounts for every customer, you’ll have to use the GROUP BY
keyword and an aggregation function. Let’s take a look at an example:
The results are shown below:

So what happened here? Put simply, you’ve made distinct groups from every first and last name (and assumed every customer has a unique name) and calculated the sum per group.
A couple of things can be improved. For example, right now we’re returning all records completely unsorted. The following code snippet orders the rows by the total sum (descending) and keeps only the first five rows:
Here are the results:

You can still improve this result set. For example, let’s say you want to combine the first and last name to a single column named _customername, and you also want to rename the aggregated column to total:
The results are shown below:

Conclusion
Today you’ve learned the basics of SQL in the PostgreSQL environment. SQL is a broad topic, and you can (and should) always learn more. More intermediate and advanced-level guides are coming soon, so stay tuned.
To summarize – do as much data filtering/aggregating in the database. It’s a much faster approach than dragging the entire dataset(s) to the memory and performing the filtering there.
Loved the article? Become a Medium member to continue learning without limits. I’ll receive a portion of your membership fee if you use the following link, with no extra cost to you.
Originally published at https://appsilon.com on January 27, 2021.