For the curious workers, especially those in the tech industry, exploring a database can bring you to nirvana. This is how I feel when digging through the vast amounts of datasets to seek insightful information for my projects. Using Structured Query Language (SQL), I can easily combine and aggregate data from various sources to answer my questions. This really takes away the pain of downloading various Excel files to then write and copy all the VLOOKUP and IF functions. With my custom SQL scripts, I can create data visualizations using Tableau or PowerBI to share insights with business stakeholders. The key tool is indeed SQL, which I now consider as the new Excel.
To be honest, I was initially scared to learn SQL. It looked like a complex coding language that only computer scientists could understand. However, things are usually not as complicated as they seem. Once I actually tried after learning from my colleague, I really got the hang of it. I started to explore use cases and generate models that have not been previously used. Above all, the flexibility and speed generated from this new skill made work more enjoyable.
In this article, I would like to share my SQL story: the mindset that helped me quickly obtain this skill.
It’s all simple math

Back in high school, fundamental mathematics was my favorite subject. I loved solving quadratic equations and trigonometry problems. It was like solving a puzzle or mystery, like who is x or what would be the distance of point a to b when we know the sinus of o? This mindset brings the excitement that will build your logical thinking.
Both math and SQL are all about logic, you just need to understand the logic to get the expected results. Having embodied a math mindset could really help accelerate my SQL Learning process. This mindset also helped me learn accounting (my undergrad major), economics, and data modelling in Microsoft Excel. The latter is actually the most essential, which I will discuss in the next section.
It’s not that different from Excel
Before working in tech, almost all of my work used Microsoft Excel. I was demanded to create models for cases like price calculation and market share analysis, using functions like VLOOKUP, IF and Pivot tables. Thanks to having a boss that watched me like a hawk and the time to refine my skills, I could learn the logic of Excel and create more efficient models. Having this knowledge was key for me to learn Sql. I found that writing a query is not that different from building an Excel Model. The data warehouse is just like a large shared folder of Excel files and my query is to fill that blank sheet. Below is a simple example that I made for this article.
Let’s say there is a fitness company, let’s call it The Good Place, that provides corporate membership to organizations in both the public and private sector. Every club has these activities/facilities: fitness classes, gym, basketball court, swimming pool, spa and snack bar. There are sets of data: (1) a member list extracted from the client organization’s registration form and (2) an activity log collected each time a member checks in/out using their card. The Good Place would like to further analyze the activity of each client organization.
The first dataset contains the member’s unique ID, full name, company/institution and gender, as shown below.

While the second dataset contains the date_id, member ID, activity and minutes spent, as below.

As you can see, the only shared column between the two datasets is MemberID. Meanwhile, The Good Place’s CEO, Eleanor Shellstrop, wants to know the average minutes spent by each client organization and categorize activities by sport (fitness classes, basketball, swimming, gym) and non-sport (snack bar and spa). This can be done by both Excel and SQL, let’s see how they compare.
Excel
Since there are two separate datasets, it is best to start by adding the columns from the member list to the activity log since the latter is considered a fact table (stores measures and metrics). The prior is a dimension table as it provides details/attributes of the Member ID column.
To create linkage, we would use the VLOOKUP function in the columns after ‘Minutes_Spent’ that would go: VLOOKUP(MemberID, ‘Sheet1_Member_List! A:D, column number, 0). A more convenient yet underused function is INDEX + MATCH, which would be: INDEX(MemberList Array, MATCH(MemberID, ‘Sheet1_Member_List! A:A, 0),MATCH(Column Name, ‘Sheet1_Member_List! A1:D1!, 0). This takes away the need to fill in the column number for the VLOOKUP function.
To add a custom column, we can use the popular IF function combined with OR, that would be IF(OR(Activity = "Snack Bar", Activity="Spa"),"Non-Sport","Sport").
The next step would be aggregating the data to generate the requested insights. There are two methods for this: creating a Pivot Table or using aggregation functions like SUMIFS and COUNTIFS. The prior is much easier, all you need to do is block the data range and insert a Pivot Table. However, it is less sustainable because (1) it adds more memory to the Excel file, making it heavier to operate and (2) you will need to refresh it every time you update the data and change the range if there are additional rows.
Aggregation functions are more favorable if the file will be reused in the long run. To serve Eleanor’s request, we can copy the category and client organization then remove duplicates. Then we calculate the aggregated Total Minutes using SUMIFS and Total Sessions using COUNTIFS since it is non-numerical. The function will be:
Total Minutes : =SUMIFS(‘Combined Dataset.csv’!$D:$D,’Combined Dataset.csv’!$E:$E,A5,’Combined Dataset.csv’!$F:$F,B5)
Total Sessions: =COUNTIFS(‘Combined Dataset.csv’!$E:$E,A5,’Combined Dataset.csv’!$F:$F,B5)
With these two aggregations, we can divide the Total Minutes with Total Sessions to get the Total Minutes/Session. The final output is shown below:

For excel users, this may seem simple. But imagine if there were more client organizations and activity logs plus Eleanor requesting this information on a daily basis? Moreover, what if Eleanor requests even more reports? Much time will be spent making models, downloading various files and copy-ing data, which is also prone to human errors. Now let’s see the SQL method.
SQL
Instead of opening all those files, we can just write a simple query as below:

This will generate the exact output as Excel:

To explain in Excel terms, SELECT is like using an equal sign to summon the needed columns in the main dataset stated in FROM, which for this case is the activity log. Instead of using VLOOKUP, we can type LEFT JOIN then the dataset we want to link (like the column selection in Excel the formula) and ON to set the linking columns (the lookup cell). CASE WHEN is similar to the IF function, just in a different format. For aggregation, you just need to use SUM and COUNT followed by GROUP BY in the end to state the columns used. This is the same as the latter parts in the SUMIFS and COUNTIFS functions where we select the columns and cell to base our aggregation. All the Excel work mentioned in the previous section can be eliminated using SQL.
To keep Eleanor informed, this query can be written in Tableau and create an accessible dashboard. Her analyst can then work on other projects without having to do any manual runs and send emails for this specific topic.
Keep Exploring and Challenging Yourself
I became more fluent in SQL by consistently practicing and accessing new datasets. Having more access stimulates the curious mind to further explore. I always get excited when learning datasets as that can strengthen my analysis. This will lead to new insights that have been overlooked and give a deeper understanding of the problem statement.
Since technology increases work dynamics, I always see the need to challenge myself. The main goal is how to work more efficiently to generate timely and accurate insights. I take time to review my previous queries and think how it can be shorter, with "there is always room for improvement" at the top of mind. There are many possible functions that have yet to be used. SQL engines like Presto have a full list on the website for further learning. Another simple way to learn is to keep Googling and the sought information will come up. It is always important to consistently refine your skills, or in the words of Stephen Covey, sharpen the saw.
Closing Remarks
I strongly encourage that anyone with an analytical role should learn SQL, especially those working in tech. Not only will it accelerate your delivery speed, but also gives you room and flexibility to explore new insights. Moreover, being fluent in SQL will make it easier to move to the next level: Python. From my personal experience, I could quickly learn Python Pandas using a similar mindset: it is all simple math, not that different from SQL, and keep practicing. Above all, we must acknowledge that current development in technology requires new skills. Therefore, we must never stop learning to become a valuable asset to our organization.