Extracting data from various sheets with Python

Or how to learn to unify Google Sheets, Excel, and CSV files — a code-along guide

Fabian Bosler
Towards Data Science
6 min read · Aug 25, 2019

Widespread tabular data storage file formats — CSV, Microsoft Excel, Google Sheets

Python is often called a glue language because a wealth of interface libraries and features has grown up around it over time, driven by its widespread use and an extensive open-source community. Those libraries give straightforward access not only to different file formats but also to other data sources such as databases, web pages, and APIs.

This story focuses on extracting the data. Next week’s story will then dive a little deeper into analyzing the combined data to derive meaningful and exciting insights.

But don’t let that stop you from analyzing the data yourself.

What you will learn:

  • Extracting data from Google Sheets
  • Extracting data from CSV files
  • Extracting data from Excel files
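As a rough preview of these three extraction tasks, the sketch below uses only the standard library. It assumes the Google Sheet is shared publicly, so that its CSV export URL can be read like any other CSV source. The names `sheet_csv_url` and `parse_csv` and the sample data are hypothetical and chosen for illustration; they are not the code from this article. Excel files have no standard-library reader, so they usually go through a third-party package such as pandas (`pandas.read_excel`).

```python
import csv
import io
from typing import Dict, List


def sheet_csv_url(sheet_id: str, gid: int = 0) -> str:
    """Build the CSV export URL for one tab of a publicly shared Google Sheet.

    Assumption: the sheet is readable by anyone with the link.
    """
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/export?format=csv&gid={gid}"
    )


def parse_csv(text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data standing in for a downloaded sheet or a local file.
sample = "name,amount\nAlice,10\nBob,20\n"
rows = parse_csv(sample)
print(rows[0]["name"], rows[1]["amount"])
```

Every value comes back as a string, so numeric columns still need converting before any analysis.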

Who is this article for:

  • Python beginners
  • People who have to wrangle data regularly

Because this is a code-along article, you should have your development environment set up (I recommend Jupyter Notebook/Lab) and a new notebook open. You can find the source code and files here.

If you don’t know how to get going with Jupyter/Python, check out this guide:

Situation:

In today’s story, I will take you into a fictitious but probably oddly familiar situation. You are to combine data from various sources to create a report or run some analyses.
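To make the combining step concrete, here is a minimal sketch that stacks rows from several already-loaded sources into one list and records where each row came from. The source names and fields are invented for illustration; the data used in this article is not shown in this section, so treat this as one possible shape rather than the actual approach.

```python
from typing import Dict, List

Row = Dict[str, str]


def combine_sources(sources: Dict[str, List[Row]]) -> List[Row]:
    """Stack rows from every source, tagging each row with its origin."""
    combined = []
    for source_name, rows in sources.items():
        for row in rows:
            # Build a new dict so the input rows are left untouched.
            combined.append({**row, "source": source_name})
    return combined


# Hypothetical rows, as they might look after loading each file type.
sources = {
    "google_sheet": [{"name": "Alice", "amount": "10"}],
    "csv_file": [{"name": "Bob", "amount": "20"}],
    "excel_file": [{"name": "Carol", "amount": "30"}],
}
report_rows = combine_sources(sources)
print(len(report_rows), report_rows[0]["source"])
```

With pandas, the same idea is usually expressed by loading each source into a DataFrame and stacking them with `pandas.concat`.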


EX-Consultant turned tech geek! Business intelligence, marketing, advanced analytics, and machine learning. 👉 https://medium.com/@fabianbosler/membership 👈