Pandas vs SQL in 5 Examples

A comparison of the merge in Pandas and the join in SQL

Soner Yıldırım
Towards Data Science

Photo by James Baltz on Unsplash

Recently, I have written several articles that focus on the comparison of tools and frameworks used in the data science ecosystem such as Pandas vs dplyr, SQL vs NoSQL, Pandas vs SQL, Seaborn vs ggplot2, Seaborn vs Altair, and so on.

In these articles, I focus on how a given task can be accomplished with different tools. I clearly see the differences as well as the similarities between them. Furthermore, it helps to build an intuition about how the creators of such tools approach particular problems.

The focus of this article is to compare Pandas and SQL in terms of the merge and join operations. Pandas is a data analysis and manipulation library for Python. SQL is a programming language used to manage data in relational databases. Both work on tabular data with labelled rows and columns.

The merge function of Pandas combines dataframes based on values in common columns. The same operation is done by joins in SQL. These are very useful operations especially when we have data about an observation (i.e. data point) in different dataframes of tables.

The author made this story available to Medium members only.
If you’re new to Medium, create a new account to read this story on us.

Or, continue in mobile web

Already have an account? Sign in

No responses yet

What are your thoughts?