The world’s leading publication for data science, AI, and ML professionals.

From Google Sheet to your Jupyter Notebook

How to connect your google sheet to your notebook

If you are not subscribed as a Medium Member, please consider subscribing through my referral.

Jupyter Notebook is a powerful tool for Data scientists. You could perform many complex algorithms with few lines and perform an analysis on a huge amount of data. Jupyter notebook itself capable to be connected with a database system where the data sleep. Although, if we talk about the database, as a data scientist the first thing that comes in mind often is the SQL.

Of course, SQL currently still the main thing if we talk about database but many people who are not coming from programming background would not that familiar with SQL. Especially professional with experience who want to move into programming data science field, often they use google sheet just to put their data.

For that reason, I want to show you how to connect our Jupyter Notebook to our Google Sheets.

Credentials

Connecting Jupyter Notebook to the Google Sheets is not a big hassle; the preparation thou could be a little. Luckily we only have to do it once.

The first thing we need to do is getting an appropriate credential from the Google Developer Console. The credential is asking a request to let us access the google sheets.

In the console, click create project. There you would find the screen shown up similar to the picture below.

It does not matter what project name you give, what important is that you create a new project. To connect into the Google Sheets, we only need this one project so at the very least choose a name you could remember.

Get back into our console and see if the project had been created or not. Select the newly created project and click on the hamburger menu on the top left (The one besides Google APIs symbol). On there select the APIs & Services then select the Dashboard.

On the dashboard, select the ENABLE APIS AND SERVICES button.

From there we would be taken to the APIs Library. It should look like the picture below.

On the search bar, type ‘Google Sheets’ to find the Google Sheets API.

From there click the Google Sheets API and enable the API.

In this step, we also Enable the Google Drive API, so search as well ‘Google Drive API’.

Now we would be taken to the Google Sheets API Dashboard. To use this API, first, we need to create the credential.

Here we would be prompt by another screen to create our credentials. Choose the Google Sheets API for Which API are you using question, Other UI for Where will you be calling the API from question and User data for What data will you be accessing question. Now click the What credentials do I need? button. The steps are shown in the picture below.

If it is your first time creating credentials, you would be prompt to create an OAuth consent screen. Click the Set up Consent Screen to create one.

There, just put any name you like in the application name and click save.

Back to the dashboard, now click the Create credentials button and select the OAuth client ID.

Here, just select the Other type and type any name you like. Then click Create.

You would return to the dashboard. Now download your newly created OAuth Client ID and put it in the folder where you would use your Jupyter Notebook.

This step above is not necessarily needed for connecting, but we did it just in case if we need it later.

We need one more thing. Now back to the create credentials and click the Service account button.

Type any name you like then click the Create button. In the next screen, choose the role as the Service Account User.

Then click create key button.

Choose JSON Key Type and click create. Save the file in the same folder with your intended notebook.

Now we get all the credentials we need. It is a long step but we only need to do it once.

Connecting Google Sheets to Jupyter Notebook

Before we start, we need to install 3 different Python modules.

pip install gspread oauth2client df2gspread

Now we come to the part where we could connect the sheets to the Jupyter Notebook. I would show it in the steps below.

  1. Import all the important modules
#Importing the module
import gspread
from df2gspread import df2gspread as d2g
from oauth2client.service_account import ServiceAccountCredentials
  1. Create a new worksheet in your Google Sheet. Then click the share button on the top right and input the email from the Service Account in your google API dashboard. This to make sure that the Worksheet is connected to our Jupyter Notebook.
  1. Initialize all the important variables
#The scope is always look like this so we did not need to change anything
scope = [
   'https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
#Name of our Service Account Key
google_key_file = 'service_key.json'
credentials = ServiceAccountCredentials.from_json_keyfile_name(google_key_file, scope)
gc = gspread.authorize(credentials)

Here now we have all the important connections we need.

Upload Data to Google Sheet

First, let’s try to upload our data from Jupyter Notebook to the Google Sheet.

import seaborn as sns
#I would use tips dataset as an example
tips = sns.load_dataset('tips')

Here is our data.

Now there are few things to note when we want to upload our data to the google sheet. We need our Worksheet ID and our sheet name. How to get it is shown in the picture below.

Let’s Upload the data now to this empty Google Sheet.

#This is the Worksheet ID
spreadsheet_key = '1ZJzLxLyfExKC-vziq21JFNzVcCISJVQxa0Dx-55Qc8k'
#This is the sheet name
wks_name = 'test_data'
#We upload the tips data to our Google Sheet. Setting the row_names to False if you did not want the index to be included
d2g.upload(tips, spreadsheet_key, wks_name, credentials=credentials, row_names=False)

And done, now we have our data in the google sheet.

Pull data from Google Sheets

It is as easy as uploading the data to google sheet if we want to pull the data. I would show it in the code below.

#Opening the worksheet by using Worksheet ID
workbook = gc.open_by_key(spreadsheet_key)
#Selecting which sheet to pulling the data
sheet = workbook.worksheet('test_data')
#Pulling the data and transform it to the data frame
values = sheet.get_all_values()
pull_tips = pd.DataFrame(values[1:], columns = values[0])
pull_tips

And that is it. We already connected with our Google Sheet from our Jupyter Notebook.

Conclusion

I have shown you to set up the required steps for us to connect our Jupyter Notebook with the Google Sheets. I also have shown how to upload and pull our data from and to Google Sheets.

I hope it helps!


If you enjoy my content and want to get more in-depth knowledge regarding data or just daily life as a Data Scientist, please consider subscribing to my newsletter here.


Related Articles